Abstract

We present a uniform construction for converting $\omega$-automata with arbitrary acceptance conditions to equivalent
deterministic parity automata (DPW). Given a non-deterministic automaton with $n$ states, our construction
gives a DPW with at most $2^{O(n^2 \log n)}$ states and $O(n^2)$ parity indices. The corresponding bounds
when the original automaton is deterministic are $O(n!)$ and $O(n)$, respectively. Our algorithm gives better
asymptotic bounds on the number of states and parity indices vis-a-vis the best known technique when
determinizing Rabin or Streett automata with $\Omega(2^n)$ acceptance pairs, where $n > 1$. We demonstrate this by
describing a family of Streett (and Rabin) automata with $2^n$ non-redundant acceptance pairs, for which the
best known determinization technique gives a DPW with at least $\Omega(2^{(n^3)})$ states, while our construction
gives a DRW/DPW with $2^{O(n^2 \log n)}$ states. An easy corollary of our construction is that an $\omega$-language
with Rabin index $k$ cannot be recognized by any $\omega$-automaton (deterministic or non-deterministic) with
fewer than $O(\sqrt{k})$ states.

Keywords: $\omega$-automata, determinization, infinity sets



1. Introduction 

The literature contains several interesting constructions for obtaining deterministic Rabin/parity
automata from nondeterministic $\omega$-automata with different accepting conditions. However, all known
constructions are tailor-made to work for nondeterministic automata with a specific kind of accepting
condition. For example, Safra's celebrated Büchi determinization construction can be used to convert
non-deterministic Büchi automata over words (NBW) to deterministic Rabin automata over words (DRW).
Piterman showed that Safra's construction can be augmented with additional machinery to obtain
deterministic parity automata over words (DPW) from NBW [11]. It requires the use of a completely
different technique (once again, originally due to Safra [16] and subsequently improved by Piterman [11])
to convert non-deterministic Streett automata over words (NSW) to equivalent DRW or DPW. We are unaware
of any construction for directly converting non-deterministic Muller automata over words (NMW) to DRW
or DPW. A two-step approach would involve first converting an NMW to an NBW, and then using
Safra's/Piterman's determinization construction for NBW to obtain a DRW/DPW. Against this backdrop, we
propose a uniform determinization construction for all $\omega$-automata for which the acceptance condition
is based on infinity sets, i.e., the set of states visited infinitely often in a run of the automaton.
It is worth noting that the acceptance conditions for all important classes of $\omega$-automata studied in
the literature are based on infinity sets.

We begin by quickly reviewing different acceptance conditions of $\omega$-automata used in the literature. Let
$\mathcal{A} = (\Sigma, Q, Q_0, \delta, \phi)$ be a (possibly non-deterministic) $\omega$-automaton, where $\Sigma$ is the alphabet, $Q$ is the set
of states, $Q_0 \subseteq Q$ is the set of initial states, $\delta : Q \times \Sigma \to 2^Q$ is the transition relation, and $\phi$ is the
acceptance condition. An acceptance condition based on infinity sets specifies properties of the set of
states visited infinitely often in an accepting run of the automaton. Hence, $\phi$ can be thought of as defining
a predicate $P_\phi$ over $2^Q$. Formally, for every $X \subseteq Q$, we say $P_\phi(X)$ = True iff $X$, viewed as the infinity set of
a run of $\mathcal{A}$, satisfies the properties specified by $\phi$. This is a re-statement of the fact that any $\omega$-automaton
with acceptance condition based on infinity sets can be converted to a Muller automaton by preserving



Preprint submitted to Elsevier 



January 12, 2011 



the transition structure of the automaton and by listing all subsets of states that satisfy $P_\phi$ in the Muller
acceptance set. We list below acceptance conditions of some important classes of $\omega$-automata and indicate
the interpretation of $P_\phi$ in each case. In all cases, we assume that $X$ is a subset of $Q$.

• Büchi condition: $\phi$ is given by $F \subseteq Q$, and $P_\phi(X)$ = True iff $X \cap F \neq \emptyset$.

• Muller condition: $\phi$ is given by a collection $\mathcal{F} = \{F_1, F_2, \ldots, F_k\}$, where $F_i \subseteq Q$ for all $i \in \{1, \ldots, k\}$,
and $P_\phi(X)$ = True iff $X \in \mathcal{F}$.

• Rabin condition: $\phi$ is given by a table of pairs $T = \{(E_1, F_1), (E_2, F_2), \ldots, (E_h, F_h)\}$, where $E_i, F_i \subseteq Q$
for all $i \in \{1, \ldots, h\}$, and $P_\phi(X)$ = True iff there exists an $i \in \{1, 2, \ldots, h\}$ such that $X \cap E_i = \emptyset$ and
$X \cap F_i \neq \emptyset$.

• Streett condition: $\phi$ is given by a table of pairs, similar to that used for the Rabin condition. However, in
this case $P_\phi(X)$ = True iff for all $i \in \{1, \ldots, h\}$, $X \cap E_i \neq \emptyset$ whenever $X \cap F_i \neq \emptyset$.

• Parity condition: $\phi$ is given by a sequence of sets $F = (F_0, F_1, \ldots, F_h)$, where $F_i \subseteq Q$ for all $i \in \{0, \ldots, h\}$.
Here, $P_\phi(X)$ = True iff for some even number $j \in \{0, \ldots, h\}$, $X \cap F_j \neq \emptyset$ and for all $m \in \{0, \ldots, j - 1\}$,
$X \cap F_m = \emptyset$.

• Emerson-Lei condition [5]: $\phi$ is given by a fairness condition, expressed as a Boolean combination
$f$ of special linear-time temporal logic formulae over atomic propositions labeling states of the
$\omega$-automaton. The sub-formulae of $f$ are such that their truth can be determined simply by knowing
the set of states visited infinitely often along a path (or run) of the automaton, and from the labels of
these states. Therefore, $P_\phi(X)$ = True iff every run of the automaton with infinity set $X$ satisfies the
temporal logic formula $f$.

It follows from the above discussion that to determine if an $\omega$-word $\alpha$ is accepted by $\mathcal{A}$, it suffices to
determine the set of infinity sets for all runs of $\mathcal{A}$ on $\alpha$, and to check if $P_\phi$ evaluates to True for any of
these infinity sets. This observation forms the basis of our construction for determinizing $\omega$-automata with
arbitrary acceptance conditions based on infinity sets.
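
The acceptance conditions listed above can be read as predicates over infinity sets. The following is a minimal Python sketch of that reading, not part of the original construction: the function names, the representation of states as strings and the helper `accepts` are assumptions made for illustration only.

```python
from typing import Callable, FrozenSet, Iterable, Sequence, Tuple

InfSet = FrozenSet[str]
Predicate = Callable[[InfSet], bool]
Pair = Tuple[FrozenSet[str], FrozenSet[str]]  # (E_i, F_i)


def buchi(F: FrozenSet[str]) -> Predicate:
    # True iff X intersects F
    return lambda X: bool(X & F)


def muller(family: Iterable[Iterable[str]]) -> Predicate:
    # True iff X is one of the listed sets
    fam = {frozenset(s) for s in family}
    return lambda X: frozenset(X) in fam


def rabin(pairs: Sequence[Pair]) -> Predicate:
    # True iff some pair has X disjoint from E_i and X intersecting F_i
    return lambda X: any(not (X & E) and bool(X & F) for E, F in pairs)


def streett(pairs: Sequence[Pair]) -> Predicate:
    # True iff for every pair, X intersects E_i whenever X intersects F_i
    return lambda X: all(bool(X & E) or not (X & F) for E, F in pairs)


def parity(sets: Sequence[FrozenSet[str]]) -> Predicate:
    # True iff the first F_j that X intersects has an even index j
    def pred(X: InfSet) -> bool:
        for j, Fj in enumerate(sets):
            if X & Fj:
                return j % 2 == 0
        return False
    return pred


def accepts(pred: Predicate, infinity_sets: Iterable[Iterable[str]]) -> bool:
    # A word is accepted iff the predicate holds for the infinity set of some run
    return any(pred(frozenset(X)) for X in infinity_sets)
```

Under these assumptions, deciding acceptance of a word reduces to calling `accepts` with the predicate for $\phi$ and the infinity sets of all runs on that word.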

The primary contribution of this paper is a uniform construction for converting $\omega$-automata with
arbitrary acceptance conditions based on infinity sets to deterministic parity automata. Given a
non-deterministic automaton with $n$ states, our construction gives a DPW with at most $2^{O(n^2 \log n)}$ states and
$O(n^2)$ parity indices. The corresponding bounds when the original automaton is deterministic are $O(n!)$ and
$O(n)$, respectively. Our algorithm gives better asymptotic bounds on the number of states and parity indices
vis-a-vis the best known technique when determinizing Rabin or Streett automata with n{n^) acceptance 
pairs, where fc > 1. We demonstrate this by describing a family of Streett (and Rabin) automata with 2*^^"^ 
non-redundant acceptance pairs, for which the best known determinization technique gives a DPW with at 
least 2'^'^" ^ states and 2'-''^"') parity indices. An easy corollary of our construction is that an w-language with 
Rabin index $k$ cannot be recognized by any $\omega$-automaton (deterministic or non-deterministic) with fewer
than $O(\sqrt{k})$ states.

The remainder of this paper is organized as follows. We begin by revisiting Schwoon's version of Safra's
NSW determinization construction and Piterman's optimization of it. We then describe our uniform
construction for determinization of $\omega$-automata, along with the intuition behind the construction and an example
that demonstrates the steps of the construction. We then prove the correctness of our construction and compute
its complexity. Finally, we demonstrate the existence of a family of NSW for which our construction provides
better upper bounds for determinization than any of the existing methods.



2. Determinizing NSW: A Recap of Safra's and Piterman's Constructions 

Since our construction is obtained by adapting Safra's determinization construction for NSW [16] and
borrows some key optimization ideas from Piterman's construction [11], we provide an overview of Safra's
and Piterman's constructions below. Additional details of Safra's construction can be found in [16, 19],
and those of Piterman's construction can be found in [11].

Safra's determinization construction for NSW is based on the idea of witness sets and hierarchically
related decompositions. Since we will use a different notion of witness sets later in the paper, we will
henceforth refer to witness sets as defined by Safra as Streett Safra witness sets. For a Streett automaton
$\mathcal{A}^s = (\Sigma, Q, Q_0, \delta, \phi)$, the acceptance condition $\phi$ is given by a Streett pairs table
$T = \{(E_1, F_1), \ldots, (E_h, F_h)\}$. Let $H = \{1, 2, \ldots, h\}$ be the set of indices of Streett pairs in $T$. A subset $J$ of $H$
is called a Streett Safra witness set for a run $\rho$ of $\mathcal{A}^s$ iff for every $j \in J$, some state in $E_j$ is visited
infinitely often in $\rho$, and for every $j \notin J$, no state in $F_j$ is visited infinitely often in $\rho$. It is easy to see
that every accepting run of $\mathcal{A}^s$ has at least one Streett Safra witness set, and any run of $\mathcal{A}^s$ with a Streett
Safra witness set is an accepting run. Note, however, that an accepting run of $\mathcal{A}^s$ can have multiple Streett
Safra witness sets. The decompositions used in Safra's construction can be viewed as hierarchically related
processes, each of which tracks a subset of runs of $\mathcal{A}^s$ on a given word, and checks if a certain subset of $H$
is a Streett Safra witness set for all the tracked runs. While Safra's original exposition [16] represents the
hierarchy between decompositions using the notion of sub-decompositions, Schwoon's exposition of Safra's
construction [19] explicitly represents the hierarchical relation between decompositions as a tree. Each node
in this tree represents a decomposition as defined by Safra, and children of a node represent
sub-decompositions in Safra's terminology. We will use the tree representation of decompositions, called
$(Q, H)$-trees by Schwoon [19], in the following discussion for clarity of exposition.
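
To make the definition of a Streett Safra witness set concrete, the following Python sketch checks the definition directly against a known infinity set. It is only an illustration: the dictionary encoding of the pairs table and the function names are assumptions, and the actual construction never has the infinity set available explicitly.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Set, Tuple

Pairs = Dict[int, Tuple[Set[str], Set[str]]]  # index j -> (E_j, F_j)


def is_streett_safra_witness(inf: Set[str], pairs: Pairs, J: Set[int]) -> bool:
    """J is a witness set iff E_j is hit for all j in J and F_j is avoided for all j not in J."""
    for j, (E, F) in pairs.items():
        if j in J:
            if not (inf & E):
                return False
        elif inf & F:
            return False
    return True


def all_witness_sets(inf: Set[str], pairs: Pairs) -> List[FrozenSet[int]]:
    """Enumerate every subset of H that is a witness set (exponential; illustration only)."""
    H = sorted(pairs)
    found = []
    for size in range(len(H) + 1):
        for J in combinations(H, size):
            if is_streett_safra_witness(inf, pairs, set(J)):
                found.append(frozenset(J))
    return found
```

For example, with pairs `{1: ({"a"}, {"b"}), 2: ({"c"}, {"d"})}` and infinity set `{"a"}`, both the empty set and `{1}` qualify, which illustrates that an accepting run can have several Streett Safra witness sets.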

Following the definition given by Schwoon [19], a $(Q, H)$-tree over $\mathcal{A}^s$ is a finitely branching rooted tree
with the following properties.

• Every leaf node is labeled with a non-empty subset of $Q$ (states of the Streett automaton $\mathcal{A}^s$).

• State labels of leaf nodes are pairwise disjoint.

• Every node is assigned a name from the set $V = \{1, 2, \ldots, 2 \cdot |Q| \cdot (|H| + 1)\}$.

• No two nodes have the same name.

• Every edge is annotated with an element of $H \cup \{0\}$.

• No edge annotation other than 0 occurs more than once on any path from the root to a leaf.

• Every non-leaf node has at least one child connected by an edge with a non-zero annotation.

• The children of every node are ordered from left to right.

Every node $v$ in a $(Q, H)$-tree can be thought of as being associated with a Streett Safra witness set,
$W(v)$, defined as follows. If $v$ is the root node, then $W(v) = \{1, 2, \ldots, h\} = H$. Otherwise, if $v'$ is the parent
of $v$ and if the edge from $v'$ to $v$ is annotated with $j$, then $W(v) = W(v') \setminus \{j\}$. Let $\lambda(v)$ denote the set
of Streett states labeling the leaves of the sub-tree rooted at $v$. Thus, if $v$ is a leaf node, $\lambda(v)$ is the state
label of $v$. However, if $v$ has children $v_1, v_2, \ldots, v_l$, then $v$ itself does not have a state label but $\lambda(v)$ is the
disjoint union of $\lambda(v_1), \lambda(v_2), \ldots, \lambda(v_l)$. A node $v$ in a $(Q, H)$-tree represents a process that tracks the runs
represented by states in $\lambda(v)$, and checks if $W(v)$ is a Streett Safra witness set for all these runs. This is
done by waiting until all $E_j$ for $j \in W(v)$ are visited in order along the runs, without visiting any $F_l$ for
$l \notin W(v)$. If this happens, the process represented by $v$ is said to have "succeeded"; it is then "reset" and
the check starts all over again. Clearly, if the process represented by $v$ is reset infinitely often, then $W(v)$
is a Streett Safra witness set for the runs tracked by this process, and hence these are accepting runs
of $\mathcal{A}^s$. On the other hand, if some state in $F_l$ for $l \notin W(v)$ is seen in a run being tracked by the process
represented by $v$, then that run is removed from this process, and a new process is started for that run. The
hierarchical relation between processes is explicitly represented by the parent-child relation between nodes
in a $(Q, H)$-tree. Intuitively, if $v'$ is the parent of $v$ and if the edge from $v'$ to $v$ is annotated with $j$, the
process represented by $v$ tracks a subset of the runs tracked by $v'$ after giving up hope that it will ever see a
state from $E_j$ in the future. While the parent $v'$ keeps alive the hope that $W(v')$ is the Streett Safra witness
set for all runs tracked by $v'$, the child $v$ refines and corrects that hope by expecting $W(v) = W(v') \setminus \{j\}$ to
be the Streett Safra witness set for the subset of runs tracked by $v$.

The DRW obtained by applying Safra's construction to a Streett automaton $\mathcal{A}^s = (\Sigma, Q, Q_0, \delta, \phi)$ is
given by $\mathcal{A}^r = (\Sigma, Q^r, Q_0^r, \delta^r, \phi^r)$, where $Q^r$ is the set of all $(Q, H)$-trees over $\mathcal{A}^s$, and $Q_0^r$ is a singleton set
containing the $(Q, H)$-tree consisting of only a root node with name 1 and labeled with $Q_0$ (the set of initial
states of $\mathcal{A}^s$). Since $\mathcal{A}^r$ is a deterministic automaton, $\delta^r$ can be thought of as a function that takes a state
(i.e., a $(Q, H)$-tree) $t$ and a letter $\sigma \in \Sigma$ and returns the next state (i.e., a $(Q, H)$-tree) $t'$. The computation
of $t'$ from $t$ and $\sigma$ is detailed in algorithm SafraNext given below (adapted from Schwoon's exposition [19]
and Piterman's correction [11] of an erroneous step in [16, 19]). Note that algorithm SafraNext calls a
recursive procedure SafraNextRecursive that is parameterized by the root node of a $(Q, H)$-sub-tree and the
corresponding Streett Safra witness set. If $|Q| = n$ and $|H| = h$, the Rabin acceptance condition $\phi^r$ is given
by a table $T^r = \{(E_i^r, F_i^r) \mid 1 \leq i \leq 2 \cdot n \cdot (h + 1)\}$, where $E_i^r$ is the set of all $(Q, H)$-trees with no node
named $i$, and $F_i^r$ is the set of all $(Q, H)$-trees in which a leaf node named $i$ occurs.



Algorithm : SafraNext

Input: $t$ : $(Q, H)$-tree over $\mathcal{A}^s$, $\sigma$ : letter in $\Sigma$
Output: $t'$ : $(Q, H)$-tree over $\mathcal{A}^s$

1. [Initialization] For every leaf node $u$ of $t$, set the state label of $u$ to $\delta(\lambda(u), \sigma)$.

2. [Recursive transformation] Let root be the root node of $t$.
Invoke SafraNextRecursive(root, $H$).

3. Return $t'$ as the $(Q, H)$-tree rooted at root.

End Algorithm : SafraNext
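
The initialization step of SafraNext replaces each leaf label by its set of successors under the input letter. A minimal Python sketch of that step follows; the dictionary encoding of $\delta$ and the names `post` and `relabel_leaves` are assumptions made for illustration and do not appear in the original algorithm.

```python
from typing import Dict, FrozenSet, Iterable, Set, Tuple

Delta = Dict[Tuple[str, str], Set[str]]  # (state, letter) -> set of successor states


def post(delta: Delta, states: Iterable[str], letter: str) -> FrozenSet[str]:
    """Union of the successor sets of all given states on the given letter."""
    successors: Set[str] = set()
    for q in states:
        successors |= delta.get((q, letter), set())
    return frozenset(successors)


def relabel_leaves(leaf_labels: Dict[int, FrozenSet[str]], delta: Delta,
                   letter: str) -> Dict[int, FrozenSet[str]]:
    """Step 1 of SafraNext: map every leaf (keyed by its name) to its successor label."""
    return {name: post(delta, label, letter) for name, label in leaf_labels.items()}
```

After this step leaf labels may overlap or become empty; restoring the tree invariants is the job of the recursive procedure that follows.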



Algorithm : SafraNextRecursive

Input: $v$ : root of a $(Q, H)$-sub-tree, $J$ : subset of $H$
Output: $t'$ : Transformed $(Q, H)$-sub-tree rooted at $v$

1. If $v$ is a leaf and $J = \emptyset$, return $t'$ as the $(Q, H)$-sub-tree rooted at $v$.

2. If $v$ is a leaf and $J \neq \emptyset$, create a new child $v'$ of $v$ with state label $\lambda(v)$, remove $\lambda(v)$ from the state
label of $v$ (since $v$ is no longer a leaf) and annotate the edge from $v$ to $v'$ with $\max W(v)$. Assign an
unused name from $V = \{1, 2, \ldots, 2 \cdot |Q| \cdot (|H| + 1)\}$ to $v'$.

3. If, after the execution of Steps (1) and (2), $v$ is not a leaf, then let $v_1, \ldots, v_l$ be the children of $v$
ordered from left to right. Let the edge from $v$ to $v_i$ be annotated with $j_i$ for all $i \in \{1, 2, \ldots, l\}$.

(a) For all $i$ from 1 to $l$, invoke SafraNextRecursive($v_i$, $J \setminus \{j_i\}$).

(b) For every child $v_i$ of $v$ and every $q \in \lambda(v_i)$, do the following.

i. If $q \in F_{j_i}$, remove $q$ from the state labels of all leaves of the sub-tree rooted at $v_i$, create a
new rightmost child $v'$ of $v$ with state label $\{q\}$, and annotate the edge from $v$ to $v'$ with $j_i$.
Assign an unused name from $V = \{1, 2, \ldots, 2 \cdot |Q| \cdot (|H| + 1)\}$ to $v'$.

ii. If $q \in E_{j_i}$, create a new rightmost child $v'$ of $v$ with state label $\{q\}$ and annotate the edge
from $v$ to $v'$ with $\max((J \cup \{0\}) \cap \{0, 1, \ldots, j_i - 1\})$. In other words, the edge is annotated
with the largest integer that is less than $j_i$ and belongs to $J$, if it exists. Otherwise, it is annotated with 0.
Assign an unused name from $V = \{1, 2, \ldots, 2 \cdot |Q| \cdot (|H| + 1)\}$ to $v'$.

4. Let $v_1, v_2, \ldots, v_{l'}$ be the children of $v$ after the above steps. Let $j_1, j_2, \ldots, j_{l'}$ be the annotations
of the corresponding edges from $v$ to its children. For every $q \in \lambda(v_i) \cap \lambda(v_k)$, where $i \neq k$ and
$i, k \in \{1, 2, \ldots, l'\}$, do the following.

(a) If $j_i < j_k$, remove $q$ from the state labels of all leaves of the sub-tree rooted at $v_k$.

(b) If $j_i = j_k$ and $v_i$ is to the left of $v_k$, remove $q$ from the state labels of all leaves of the sub-tree
rooted at $v_k$.

5. For every descendant $u$ of $v$ such that $\lambda(u) = \emptyset$, delete $u$ and all its descendants.

6. If, after the previous steps, all edges from $v$ to its children are annotated with 0, then the process
represented by $v$ has "succeeded" and needs to be "reset". Let $S = \lambda(v)$. Make $v$ a leaf node by deleting
all its children and their descendants, and set the state label of $v$ to $S$.

7. Return $t'$ as the $(Q, H)$-sub-tree rooted at $v$.

End Algorithm : SafraNextRecursive
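
Step 4 of SafraNextRecursive resolves states shared between siblings: a shared state stays with the child whose edge annotation is smaller and, on a tie, with the child further to the left. The Python sketch below illustrates only this tie-breaking rule. It flattens each sub-tree to its state set $\lambda(v_i)$, which is a simplifying assumption, and the function name is hypothetical.

```python
from typing import Dict, List, Set, Tuple

Child = Tuple[int, Set[str]]  # (edge annotation j_i, lambda(v_i)), listed left to right


def resolve_shared_states(children: List[Child]) -> List[Child]:
    """Keep each state only in the sibling with the smallest annotation, leftmost on ties."""
    owner: Dict[str, Tuple[int, int]] = {}
    for position, (annotation, states) in enumerate(children):
        for q in states:
            key = (annotation, position)
            if q not in owner or key < owner[q]:
                owner[q] = key
    return [
        (annotation, {q for q in states if owner[q] == (annotation, position)})
        for position, (annotation, states) in enumerate(children)
    ]
```

Children whose state set becomes empty here correspond to the nodes that Step 5 deletes.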



It was shown by Safra that given an NSW with $|Q| = n$ and $|H| = h$, the above construction gives a
deterministic Rabin automaton with $2^{O(n h \log(n h))}$ states and $O(n \cdot h)$ Rabin acceptance pairs. Although
a proof of correctness of the construction was provided in [16, 19], Piterman pointed out a minor error
in the construction and rectified it in [11]. Fortunately, Piterman's correction affects only a single step of
Safra's construction and does not change the asymptotic count of states or Rabin acceptance pairs. The
fact that this erroneous step evaded the scrutiny of researchers for almost 14 years is testimony to the
intricate nature of the arguments used in Safra's construction. Piterman also proposed an adaptation of Safra's
construction that uses only $n \cdot (h + 1)$ names (instead of the $2 \cdot n \cdot (h + 1)$ names used by Safra) and gives a
deterministic parity automaton with $2^{O(n h \log(n h))}$ states and $2 \cdot n \cdot h$ parity indices. Currently, Piterman's
construction is the best known determinization construction for NSW.

Piterman's adaptation of Safra's construction involves two key ideas: (i) a new strategy for naming nodes,
and (ii) addition of two integer-valued components, $e$ and $f$, to every state of the constructed automaton
that allows a parity acceptance condition to be defined. In the new naming strategy, whenever a new node
is created in steps (3(b)i) or (3(b)ii) of algorithm SafraNextRecursive, it is assigned the smallest name
larger than all names used so far in the construction of $t'$ from $t$. In addition, after algorithm SafraNext has
finished computing $t'$, a name-compaction step is performed. In this step, for each node $v$ with name $i$ in $t'$,
we determine the count, $rem(v)$, of nodes that were removed during the construction of $t'$ from $t$ and had
names less than $i$. The name of $v$ is then reduced from $i$ to $i - rem(v)$. This ensures that there are no gaps
in the set of names assigned to nodes in a $(Q, H)$-tree after the name-compaction step. Piterman's naming
strategy also ensures that the name of a node $v$ is less than that of node $u$ iff $v$ was created before $u$. Since
the name of a node that stays back in a run (sequence of $(Q, H)$-trees) can only reduce finitely many times,
it follows that all nodes that eventually stay back in a run get fixed names that are smaller than the names
of all other nodes that keep getting created and removed.

The new state components $e$ and $f$ in Piterman's construction keep track of the smallest name of a node
removed and the smallest name of a node that represents a successful process (see step (6) of algorithm
SafraNextRecursive), respectively, in the construction of $t'$ from $t$. A state in the resulting automaton is
therefore a $(Q, H)$-tree coupled with a pair of integers $e, f \in \{1, \ldots, n \cdot (h + 1) + 1\}$, with the restriction that
the root node is always named 1 and all nodes are assigned names from $\{1, \ldots, n \cdot (h + 1)\}$. Piterman calls
these states compact Streett Safra trees over $\mathcal{A}^s$, and obtains a deterministic parity automaton by defining
a parity acceptance condition as follows. Let $D$ denote the set of all compact Streett Safra trees over $\mathcal{A}^s$.
Piterman's parity acceptance condition is given by $(F_0, F_1, \ldots, F_{2m-1})$, where $m = 2 \cdot n \cdot (h + 1)$ and the $F_i$
are defined as follows.

• $F_0 = \{d \in D \mid f = 1 \text{ and } e > 1\}$

• $F_{2i+1} = \{d \in D \mid e = i + 2 \text{ and } f \geq e\}$, for all $i \in \{0, \ldots, m - 1\}$

• $F_{2i+2} = \{d \in D \mid f = i + 2 \text{ and } e > f\}$, for all $i \in \{0, \ldots, m - 2\}$

A proof of correctness of the above construction is given in [11]. It is also shown there that the DPW
obtained using this construction has at most $2 \cdot n^n \cdot (k + 1)^{n \cdot (k + 1)} \cdot (n \cdot (k + 1))!$ states and $2 \cdot n \cdot (k + 1)$
parity indices.
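
Read as a function of the pair $(e, f)$, the parity condition above assigns every state exactly one index. The short Python sketch below computes that index, assuming that $F_0$ is the set with $f = 1$ and $e > 1$, that odd sets are selected by $e = i + 2$ with $f \geq e$, that even sets are selected by $f = i + 2$ with $e > f$, and that states with $e = 1$ never arise. The function name is hypothetical.

```python
def parity_index(e: int, f: int) -> int:
    """Index j such that a state with components (e, f) lies in F_j.

    Assumes e >= 2, i.e. the root (named 1) is never removed.
    """
    if e < 2 or f < 1:
        raise ValueError("expected e >= 2 and f >= 1")
    if f < e:
        # f = i + 2 with e > f gives index 2i + 2 = 2f - 2 (f = 1 gives F_0)
        return 2 * f - 2
    # e = i + 2 with f >= e gives index 2i + 1 = 2e - 3
    return 2 * e - 3
```

A successful process with a small name therefore yields a small even index, while a removal with a name at least as small yields a smaller odd index and overrides it.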

3. A uniform determinization construction for $\omega$-automata

We now describe a construction for converting $\omega$-automata with arbitrary acceptance conditions based
on infinity sets to deterministic parity automata. Our construction can be viewed as an adaptation of
Safra's NSW determinization construction that works for arbitrary acceptance conditions. As part of our
construction, we use Piterman's naming strategy and his idea of using the $e$, $f$ components of states to get a
parity acceptance condition. Interestingly, although our construction is based on Safra's and Piterman's
constructions, we are able to sharpen the asymptotic upper bound for Streett and Rabin determinization
beyond those obtainable by Safra's and Piterman's constructions.

Let $\mathcal{A} = (\Sigma, Q, Q_0, \delta, \phi)$ be an $\omega$-automaton, where $\phi$ is an arbitrary acceptance condition based on
infinity sets. Let $P_\phi$ denote the predicate corresponding to $\phi$. Without loss of generality, we will assume
that $Q = \{q_1, q_2, \ldots, q_n\}$, where $n = |Q|$. For notational clarity, we will henceforth refer to states of $\mathcal{A}$ as
$\mathcal{A}$-states, and use $[p]$ to denote the set $\{1, 2, \ldots, p\}$ for every natural number $p > 0$. For every $W \subseteq [n]$, we
also define $Q_W$ to be the set $\{q_i \mid q_i \in Q, i \in W\}$.

Motivated by the role played by Streett Safra witness sets in Safra's NSW determinization construction,
we now define generalized witness sets for $\omega$-automata with arbitrary acceptance conditions based on infinity
sets.

Definition 1 (Generalized Witness Set). A set $W \subseteq [n]$ is a generalized witness set for a run $\rho$ of $\mathcal{A}$
iff $inf(\rho) = Q_W$ and $P_\phi(Q_W)$ = True.

Note that Streett Safra witness sets are distinct from generalized witness sets even when $\mathcal{A}$ is a Streett
automaton. By definition, a Streett Safra witness set is a subset of indices of Streett acceptance pairs,
while a generalized witness set is a subset of indices of $\mathcal{A}$-states. Thus, if $\mathcal{A}$ has $n$ states and $h$ pairs in its
acceptance table, and if $n \ll h$ (examples of NSW with this property are given in Section 5), there can be
many more Streett Safra witness sets than generalized witness sets. The situation is reversed if $h \ll n$. It
follows from the definition above that a run $\rho$ of $\mathcal{A}$ can have at most one generalized witness set, although
it may have multiple Streett Safra witness sets. Furthermore, the generalized witness set of $\rho$ uniquely
determines $inf(\rho)$, while a Streett Safra witness set for $\rho$ does not necessarily determine $inf(\rho)$ uniquely.
Finally, if $\mathcal{A}$ is a Streett automaton and if a run $\rho$ of $\mathcal{A}$ has a generalized witness set, then it has at least
one (and perhaps more) Streett Safra witness sets. Conversely, if $\rho$ has at least one Streett Safra witness
set, then it has exactly one generalized witness set.
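
Definition 1 can be restated operationally: once the infinity set of a run is known, its generalized witness set is either the index set of that infinity set or does not exist. The Python sketch below makes this explicit; the `index_of` mapping from $\mathcal{A}$-states to their indices and the function name are assumptions made for illustration.

```python
from typing import Callable, Dict, FrozenSet, Iterable, Optional


def generalized_witness_set(
    inf_states: Iterable[str],
    predicate: Callable[[FrozenSet[str]], bool],
    index_of: Dict[str, int],
) -> Optional[FrozenSet[int]]:
    """Return the unique W with inf = Q_W if the predicate holds on inf, else None."""
    inf = frozenset(inf_states)
    if not predicate(inf):
        return None
    return frozenset(index_of[q] for q in inf)
```

Because the result is a function of the infinity set alone, a run can never have two different generalized witness sets, in line with the remark above.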

The use of generalized witness sets allows us to adapt Safra's construction to obtain a uniform
determinization construction for $\omega$-automata with arbitrary acceptance conditions. We detail this construction
in the following subsections.

3.1. Intuition 

The intuition behind our construction parallels that behind Safra's NSW determinization construction,
with some key differences stemming from the use of generalized witness sets instead of Streett Safra witness
sets. The overall idea is to construct a deterministic automaton that simulates all runs of $\mathcal{A}$ on an $\omega$-word
$\alpha$, and uses a Rabin acceptance condition to simultaneously identify the set of state indices in the inf-set
of a run and check if this set is a generalized witness set. The construction of the Rabin automaton can be
adapted to give a deterministic parity automaton using techniques employed by Piterman [11]. Although
there are an exponential number of potential generalized witness sets, we use Safra's idea of building a
process decomposition (represented as a tree), in which each process tracks a subset of runs and checks if a
given subset of $\mathcal{A}$-state indices is a generalized witness set for these runs. Using the same reasoning as used
by Safra, we can show that only a polynomial number of generalized witness sets need to be examined at
any time in order to determine if a run has a generalized witness set.

As in Safra's and Piterman's constructions [16, 11], each state of the DPW obtained by our
construction is a tree of hierarchically related processes, with additional book-keeping information. The
process represented by a node in the tree tracks a subset of runs of the automaton $\mathcal{A}$. Each process is also
associated with a set of indices of $\mathcal{A}$-states, called the hope set for the process. A process hopes that its
hope set gives the indices of states in the inf-set of all runs tracked by it. This is checked by waiting for
all states with indices in the hope set to be visited in turn by every run tracked by the process, without
visiting any state with index outside the hope set. If this happens, the process is said to have "succeeded"
locally; it is then "reset" and the check starts all over again. Clearly, if the process represented by a node $v$
is reset infinitely often, its hope set gives the indices of states in the inf-set of all runs tracked by it. If, in
addition, the set of states with indices in the hope set causes $P_\phi$ to evaluate to True, the hope set must be
a generalized witness set of all runs tracked by the process. In this case, there exists at least one accepting
run of $\mathcal{A}$ on the input word. On the other hand, if some state with an index outside the hope set is seen in a
run tracked by a process, the corresponding run is removed from the process, and a new process is initiated
for that run. As in Safra's and Piterman's constructions, we use an acceptance condition that checks for
the existence of a node $u$ that is eventually never deleted in the sequence of trees (states) in an infinite run
of the constructed automaton, but is reset infinitely often. Unlike Safra's and Piterman's constructions, we
also require that the hope set of the process corresponding to node $u$ be such that the corresponding set
of $\mathcal{A}$-states renders $P_\phi$ True. In the remainder of the discussion, we will refer to a node and the process
represented by it interchangeably when there is no confusion.
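
The behaviour of a single process on a single run can be illustrated in isolation. The Python sketch below follows one finite run prefix, given as a sequence of state indices, and counts how often a process with a given hope set would be reset before the run leaves the hope set. It is a deliberately simplified illustration: the real construction tracks many runs inside a tree, and the choice of waiting for indices in decreasing order is an assumption of this sketch.

```python
from typing import Iterable, Optional, Sequence, Tuple


def count_resets(run_indices: Sequence[int],
                 hope: Iterable[int]) -> Tuple[int, Optional[int]]:
    """Return (number of resets, position where the run left the hope set or None)."""
    hope_set = set(hope)
    order = sorted(hope_set, reverse=True)  # assumed waiting order: largest index first
    waiting_for = 0                         # position in `order` of the next awaited index
    resets = 0
    for position, index in enumerate(run_indices):
        if index not in hope_set:
            # The run visits a state outside the hope set: it leaves this process.
            return resets, position
        if index == order[waiting_for]:
            waiting_for += 1
            if waiting_for == len(order):
                # Every index of the hope set was seen in turn: success, then reset.
                resets += 1
                waiting_for = 0
    return resets, None
```

On an infinite run, unboundedly many resets without ever leaving the hope set would mean that the hope set is exactly the index set of the run's inf-set.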



3.2. The determinization construction 

Piterman used compact Streett Safra trees to represent states of the deterministic parity automaton in
his NSW determinization construction [11]. We follow the same approach and use a variant of compact
Streett Safra trees, called compact generalized Safra trees, or CGS trees. Formally, a CGS tree $t$ over
$\mathcal{A} = (\Sigma, Q, Q_0, \delta, \phi)$ is an 8-tuple $(N, M, r, p, \lambda, h, e, f)$, where

• $N$ is the set of nodes.

• $M : N \to [|Q|^2 + |Q| + 1]$ is the naming function.

• $r$ is the root node.

• $p : N \to N$ is the parenthood function defined for $N \setminus \{r\}$. Thus, $p(v)$ is the parent of $v \in N \setminus \{r\}$.

• $\lambda : N \to 2^Q$ is a state labeling function that associates a subset of $Q$ with each node. The state label
of every node is equal to the union of state labels of its children. Furthermore, the state labels of two
siblings are disjoint.

• $h : N \to 2^{[|Q|]}$ is an annotation of nodes with a subset of $[|Q|]$. The root is always annotated with $[|Q|]$.
The annotation of every node is contained in that of its parent and differs by at most one element from
the annotation of its parent. Every non-leaf node $v$ has at least one child with an annotation that is
a strict subset of $h(v)$. For a node $v$ with annotation $J$ and child $v'$ with annotation $J' = J \setminus \{j\}$, we
will say that the edge from $v$ to $v'$ is annotated with $j$. If $J' = J$, we will say that the edge from $v$ to
$v'$ is annotated with 0.

• $e, f \in [|Q|^2 + |Q| + 2]$ are two integers used to define the parity acceptance condition.
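
The components of a CGS tree map naturally onto a small record type. The following Python sketch is one possible encoding with a check of the structural conditions listed above. The class name, the use of string node identifiers, the representation of state labels by state indices, and the requirement that names be pairwise distinct are assumptions of this sketch rather than part of the definition.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List


@dataclass
class CGSTree:
    n: int                            # number of A-states, n = |Q|
    name: Dict[str, int]              # M: node -> name
    root: str                         # r
    parent: Dict[str, str]            # p: defined for every node except the root
    label: Dict[str, FrozenSet[int]]  # lambda: node -> indices of A-states
    hope: Dict[str, FrozenSet[int]]   # h: node -> subset of [n]
    e: int
    f: int

    def children(self, v: str) -> List[str]:
        return [u for u, par in self.parent.items() if par == v]

    def edge_annotation(self, v: str) -> int:
        """Element removed from the parent's annotation, or 0 if nothing was removed."""
        removed = self.hope[self.parent[v]] - self.hope[v]
        return max(removed) if removed else 0

    def is_well_formed(self) -> bool:
        m = self.n * self.n + self.n + 1
        names = list(self.name.values())
        # Assumption of this sketch: names are pairwise distinct and lie in [m].
        if len(set(names)) != len(names) or not all(1 <= x <= m for x in names):
            return False
        if not (1 <= self.e <= m + 1 and 1 <= self.f <= m + 1):
            return False
        if self.hope[self.root] != frozenset(range(1, self.n + 1)):
            return False
        for v in self.name:
            kids = self.children(v)
            if not kids:
                continue
            union: FrozenSet[int] = frozenset()
            has_strict_child = False
            for c in kids:
                if not self.hope[c] <= self.hope[v]:
                    return False
                if len(self.hope[v] - self.hope[c]) > 1:
                    return False
                has_strict_child = has_strict_child or self.hope[c] < self.hope[v]
                if union & self.label[c]:
                    return False  # sibling labels must be disjoint
                union |= self.label[c]
            if union != self.label[v] or not has_strict_child:
                return False
        return True
```

Storing the annotation on nodes rather than on edges mirrors the definition; `edge_annotation` recovers the edge view used in the algorithms.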

Note that CGS trees differ from compact Streett Safra trees fll'l only in the annotation of nodes. In a 
compact Streett Safra tree, each node is annotated with a potential Streett Safra witness set, while in a 
CGS tree, the annotations are potential generalized witness sets. As discussed earlier, generalized witness 
sets can differ significantly from Streett Safra witness sets even when ^ is a Streett automaton. Intuitively, 
each node v in a compact generalized Safra tree represents a process that tracks the runs of A currently 
represented by A(u), and hopes that Q^^) is the inf -set of these runs. The set h{v) may therefore be viewed 
as the hope set for the process represented by v. 
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Given A = (S, Q, Qq, 6, (/)), we now construct a deterministic parity automaton (DPW) 2? = (S, T, to, 6^, V) 
such that L{A) = L{'D). In the following, we assume that n = \Q\ and m — |Qp + |Q| + 1. The different 
components of V are as defined below. 

• T is the set of all CGS trees over A. 

• to is the CGS tree with a single (root) node tq, with A(ro) — Qo, M{ro) = 1 and /i(ro) — [n]. For io, 
we set e = / = m + 1 . 

• The parity acceptance condition V = {Fq, Fi, . . . , i^2m-i) is defined in the same manner as done by 
Piterman dH. Specifically, 

- Fo=={ter|/-l,e>l} 

- F2^+l = {teT\e = t + 2J> 

- F2^+2 -{ter|/-i + 2,e> 

- F2,n-i = {teT\eJ>m} 

For reasons to be seen later, no CGS tree that arises in our construction can have e = 1; hence CGS trees 
with e = 1 are excluded from the Fi sets defined above. 

• 5^ is a deterministic transition function that returns a unique next state (CGS tree) t' for every current 
state t Cz T and input symbol cr e E. The computation of t' from t and a is accomplished by invoking 
algorithm GeneralizedNext(i, cr), as detailed below. 

Recall that a CGS tree has named, state-labeled and annotated nodes hierarchically arranged as a rooted 
tree, along with two integer valued components named e and /. Computing t' from t and a therefore 
involves transforming the hierarchical arrangement of nodes and determining new values for e and /, in 
general. Component e of t' is intended to record the smallest name of a node that was deleted during the 
transformation of the hierarchical arrangement. Similarly, component / is meant to record the smallest name 
of a node that was "reset" (in the sense described in Section [XT| . had a hope set such that the corresponding 
set of A states satisfies P^, and was not deleted subsequently during the transformation of the hierarchical 
arrangement. Since a node can be deleted in a step after being reset, algorithm GeneralizedNext uses a set U to 
remember all nodes that were reset and had hope sets such that the corresponding set of A states satisfies P^, 
in some step during the transformation. Finally, component / is set to the smallest name of a node in U that 
survives the transformation. The task of transforming the hierarchical arrangement of nodes is accomplished 
by invoking algorithm Genera I izedNextRecursive, as described below. As the transformation proceeds through 
recursive calls to GeneralizedNextRecursive and nodes are reset and/or deleted from the CGS tree, component 
e and the set U described above are updated. After the transformation of the hierarchical arrangement is 
completed, a name-compaction step is performed on the nodes of the resulting CGS tree in the same way as 
is done in Although intermediate steps of algorithm GeneralizedNextRecursive may use names of nodes 
outside the set [to], the name-compaction step ensures that all names used in the final CGS tree t' are within 
[m]. The pseudocode of algorithms GeneralizedNext and GeneralizedNextRecursive are presented below. 



Algorithm : GeneralizedNext 



Input: $t$ : CGS tree over $\mathcal{A}$, $\sigma$ : letter in $\Sigma$
Output: $t'$ : CGS tree over $\mathcal{A}$



1. [Initialization] Initialize $e$ and $f$ to $m + 1$. Initialize $U$ to $\emptyset$. For every node $u$ in $t$, set $\lambda(u)$ to
$\delta(\lambda(u), \sigma)$.

2. [Recursive transformation] Let root be the root node of $t$.
Invoke GeneralizedNextRecursive(root).



3. [Name-compaction] Let $t$ be the CGS tree rooted at root after Step 2. Let $Z$ be the set of CGS tree
nodes removed during the execution of Step 2. For every node $u$ in $t$, let $rem(u) = |\{u' \in Z \mid M(u') <
M(u)\}|$. Update $M(u)$ to $M(u) - rem(u)$.

4. [Update of component $f$] Let $t$ be the CGS tree rooted at root that results after Step 3. Let
$N$ be the set of nodes in $t$. Set $f$ to the minimum of its current value and $\min\{M(v') \mid v' \in U \cap N\}$.

5. Return $t'$ as the CGS tree rooted at root with $e$ and $f$ components as calculated above.



End Algorithm : GeneralizedNext 
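Steps 3 and 4 of GeneralizedNext can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the dictionary-based representation (node identifier mapped to name) and the function names `compact_names` and `update_f` are assumptions introduced here.

```python
def compact_names(names, removed_names):
    """Step 3 (sketch): subtract from every surviving name the number of
    removed names that are strictly smaller than it.

    names         -- dict: surviving node id -> name before compaction
    removed_names -- iterable of names of nodes deleted in Step 2
    """
    removed = list(removed_names)
    return {
        node: name - sum(1 for r in removed if r < name)
        for node, name in names.items()
    }


def update_f(f, compacted, reset_nodes):
    """Step 4 (sketch): f becomes the minimum of its current value and the
    smallest (compacted) name of a node in U that survived the transformation.

    compacted   -- dict: surviving node id -> name after compaction
    reset_nodes -- the set U of node ids (hypothetical representation)
    """
    surviving = [compacted[u] for u in reset_nodes if u in compacted]
    return min([f] + surviving)
```

For instance, if the surviving nodes are named 1 and 7 and the removed nodes were named 2 through 6, `compact_names({"root": 1, "leaf": 7}, {2, 3, 4, 5, 6})` renames the leaf to 2 and leaves the root at 1.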



Algorithm : GeneralizedNextRecursive 



Input: $v$ : root of a CGS sub-tree

Output: $t'$ : Transformed CGS sub-tree rooted at $v$, updated values of $e$ and $U$



1. If $v$ is a leaf and $h(v) = \emptyset$, return $t'$ as the CGS sub-tree rooted at $v$.

2. If $v$ is a leaf and $h(v) \neq \emptyset$, create a new child $v'$ of $v$. Set $\lambda(v') = \lambda(v)$, $h(v') = h(v) \setminus \{\max(h(v))\}$ and
$M(v')$ to the smallest name greater than all names already used. Note that this may require using
names not in $[m]$.

3. If, after the execution of Steps 1 and 2, $v$ is not a leaf, then let $v_1, \ldots, v_l$ be the children of $v$
ordered according to their names. Let $j_1, \ldots, j_l$ be indices such that $j_i = \max((h(v) \cup \{0\}) \setminus h(v_i)) \geq 0$.
As discussed earlier (in the definition of compact generalized Safra trees), we will say that the edge
from $v$ to $v_i$ is annotated with $j_i$.

(a) For all $i$ in 1 through $l$, invoke GeneralizedNextRecursive($v_i$).

(b) For every child $v_i$ of $v$ and every $q \in \lambda(v_i)$, do the following.

i. If $q = q_{j_i}$, then create a new child $v'$ of $v$.

Set $\lambda(v') = \{q\}$, $h(v') = h(v) \setminus \{\max((h(v) \cup \{0\}) \cap \{0, 1, 2, \ldots, j_i - 1\})\}$. The edge from
$v$ to $v'$ is thus annotated with the largest integer smaller than $j_i$ but in $h(v)$, if it exists.
Otherwise, the edge is annotated with 0. Set $M(v')$ to the smallest name greater than all
names already used.

ii. If $q \neq q_{j_i}$ and $q \notin Q_{h(v_i)} = \{q_j \mid q_j \in Q, j \in h(v_i)\}$, remove $q$ from $\lambda(v_i)$ and also from $\lambda(u)$
for all descendants $u$ of $v_i$.

4. Let $v_1, v_2, \ldots, v_{l'}$ be the children of $v$ after the above steps. Let $j_1, \ldots, j_{l'}$ be the annotations of the
corresponding edges from $v$ to its children. In other words, let $j_i = \max((h(v) \cup \{0\}) \setminus h(v_i))$ for
$i \in \{1, 2, \ldots, l'\}$. Then for every $q \in \lambda(v_i) \cap \lambda(v_k)$, where $i \neq k$ and $i, k \in \{1, \ldots, l'\}$, do the following.

(a) If $j_i < j_k$, remove $q$ from $\lambda(v_k)$ and from $\lambda(u)$ for all descendants $u$ of $v_k$.

(b) If $j_i = j_k$ and $M(v_i) < M(v_k)$, remove $q$ from $\lambda(v_k)$ and from $\lambda(u)$ for all descendants $u$ of $v_k$.

5. For every descendant $u$ of $v$ such that $\lambda(u) = \emptyset$, delete $u$ and all its descendants.

6. If, after the previous steps, all children of $v$ have annotation $h(v)$, then the process represented by $v$
is said to "succeed" locally and needs to be "reset". Delete all descendants of $v$, so that $v$ becomes a
leaf node. Additionally, if $P_\varphi(Q_{h(v)}) = \mathrm{True}$, then update $U$ to $U \cup \{v\}$.

7. Update $e$ to the minimum of its previous value and the smallest name among all descendants of $v$ that
were deleted.

8. Return $t'$ as the CGS sub-tree rooted at $v$.



Note that if $h(v) = h(v_i)$, then $j_i = 0$.



End Algorithm : GeneralizedNextRecursive 
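The hope-set arithmetic used in Steps 2, 3 and 3(b)i can be made concrete with a small Python sketch. The function names below are hypothetical and hope sets are modeled as Python sets of positive integers; this illustrates the formulas in the pseudocode and is not the authors' code.

```python
def edge_annotation(h_parent, h_child):
    """Annotation j = max((h(v) U {0}) \\ h(v_i)) of the edge from v to v_i."""
    return max((set(h_parent) | {0}) - set(h_child))


def first_child_hope_set(h_parent):
    """Step 2 (sketch): the first child drops the largest index of h(v).
    Assumes h(v) is non-empty, as required by Step 2."""
    return set(h_parent) - {max(h_parent)}


def hope_set_after_seeing(h_parent, j):
    """Step 3(b)i (sketch): hope set of the child created when q_j is seen on
    an edge annotated j (assumes j >= 1). The child drops the largest element
    of h(v) below j; if there is none, 0 is 'dropped' and the child keeps h(v)."""
    lower = (set(h_parent) | {0}) & set(range(j))
    return set(h_parent) - {max(lower)}
```

With $h(v) = \{1, \ldots, 5\}$, the first child has hope set $\{1, 2, 3, 4\}$ and its edge is annotated 5. Seeing $q_5$ on that edge yields a new child with hope set $\{1, 2, 3, 5\}$, whose edge is annotated 4. When the annotated index is the smallest element of $h(v)$, the new child keeps $h(v)$ and its edge is annotated 0.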



The similarity of algorithms GeneralizedNext and GeneralizedNextRecursive to the corresponding algorithms
in Safra's and Piterman's NSW determinization constructions is striking. Yet, there are important
differences that enable our construction to handle arbitrary acceptance conditions, and even to improve on Safra's and
Piterman's constructions when the number of Streett pairs is large compared to the number of states.

The computation of $\delta_{\mathcal{D}}(t, \sigma)$ starts by determining the successors of all $\mathcal{A}$-states appearing in state
labels of nodes in the CGS tree $t$, under the input symbol $\sigma$. Algorithm GeneralizedNextRecursive is then
invoked on the resulting tree rooted at root. This recursively "extends" the tree (in Steps 1, 2 and the
recursive call in Step 3(a) of algorithm GeneralizedNextRecursive) by adding new leaf nodes with successively
smaller hope sets until each leaf node has an empty hope set. As the recursive calls return, algorithm
GeneralizedNextRecursive determines in a bottom-up manner which nodes in the extended CGS tree must
have their hope sets invalidated and/or hierarchical relations modified. We explain below the reasoning
behind this crucial step in the computation of $t'$.

Suppose the hope set of a node $v$ is $h(v)$ and that of its child $v''$ is $h(v'')$. Suppose further that the
edge from $v$ to $v''$ is annotated with $j_i$, i.e., $h(v) \setminus h(v'') = \{j_i\}$. This represents a situation wherein the
process represented by $v$ is waiting to see $q_{j_i}$ in the subset of runs being tracked by its child $v''$, but the
process represented by $v''$ has given up hope of seeing any further $q_{j_i}$'s in the runs it is tracking. Now,
suppose after reading an input symbol $\sigma$, the initialization step of algorithm GeneralizedNext places $q_{j_i}$ in
$\lambda(v'')$ (and hence also in $\lambda(v)$). This implies that $v''$ has seen a state along a run it was tracking, such
that the corresponding state index is outside its own hope set but is in the hope set of its parent. Since
every node expects to see all and only states with indices in its hope set in all runs being tracked by it, the
above situation warrants two actions: (i) invalidating the hope set of $v''$ for the run represented by $q_{j_i}$, and
(ii) registering progress towards the realization of $v$'s hope set as the set of state indices in the inf-set of
the run represented by $q_{j_i}$. Accordingly, $q_{j_i}$ is removed from $\lambda(v'')$ by the sequence of Steps 3(b)i and 4 of
algorithm GeneralizedNextRecursive. In addition, Step 3(b)i creates a new child $v'$ of $v$ with $\lambda(v') = \{q_{j_i}\}$,
and annotates the edge from $v$ to $v'$ with the next index (after $j_i$ in decreasing order), say $j_k$, in the hope
set of $v$. This represents the new situation wherein the process represented by $v$ has seen $q_{j_i}$ and is waiting
to see the next $\mathcal{A}$-state in its hope set, i.e. $q_{j_k}$, in the run (currently) represented by $q_{j_i}$. The new child $v'$
however hopes to see no further $q_{j_k}$'s in the run represented by $q_{j_i}$; hence its hope set is set to $h(v) \setminus \{j_k\}$.
A special situation arises if $q_{j_i}$ is the lowest indexed $\mathcal{A}$-state in $Q_{h(v)}$. In this case, node $v$ has seen all states
with indices in its hope set in the run represented by $q_{j_i}$ since the last time $v$ was "reset". The edge from $v$
to $v'$ is annotated with a special index, i.e. 0, to represent this situation. The newly created child $v'$ retains
the same hope set as $v$, i.e. $h(v)$, and is now delegated the task of checking if $Q_{h(v)}$ is the inf-set of the
run currently represented by $q_{j_i}$. Meanwhile, the parent node $v$ continues to check if all states with indices
in its hope set, i.e. $h(v)$, are seen in the remaining runs (other than the one currently represented by $q_{j_i}$)
that it was tracking.

A different situation arises if the initialization step of algorithm GeneralizedNext places $q_{j_i}$ in $\lambda(v'')$ for
a child $v''$ of $v$, but $j_i$ is neither the annotation of the edge from $v$ to $v''$, nor is in the hope set of $v''$. This
represents a situation wherein the process represented by $v$ was waiting to see some $\mathcal{A}$-state other than $q_{j_i}$
next in the runs being tracked by $v''$, and the process represented by $v''$ was expecting to never see $q_{j_i}$ in any
run being tracked by it. Since $q_{j_i}$ is in $\lambda(v'')$, the hope set of $v''$ must be invalidated for the run currently
represented by $q_{j_i}$. This is done in Step 3(b)ii of algorithm GeneralizedNextRecursive by removing $q_{j_i}$ from
the state label of $v''$ and all its descendants. Note, however, that we cannot remove the run represented by
$q_{j_i}$ from the state label of $v$ yet. Although $v$ was not expecting $q_{j_i}$ to be the next $\mathcal{A}$-state in the runs being
tracked by $v''$, the hope set of $v$ may still contain $j_i$. Therefore, the hope set of $v$ need not be invalidated
yet for the run corresponding to $q_{j_i}$. As the recursive calls to algorithm GeneralizedNextRecursive return, the
hope set of $v$ will be examined in turn to determine if a run being tracked by $v$ has encountered a state with
index outside $v$'s hope set. If so, the run will then be removed from the set of runs being tracked by $v$.

Since runs tracked by different nodes in a CGS tree may merge, we may encounter a situation wherein the
same $\mathcal{A}$-state $q$ appears in the state labels of multiple nodes that are not related as ancestors or descendants
in the tree. However, by definition, two nodes in a CGS tree can have overlapping state labels only if one
is an ancestor (or descendant) of the other. Algorithm GeneralizedNextRecursive rectifies this situation by
ensuring that whenever an $\mathcal{A}$-state $q$ appears in the state labels of multiple children of a node $v$, at most one
child eventually gets to retain $q$ in its state label. The chosen child is the one that represents the maximum
progress (since $v$ was last reset) towards realisation of the hope set of $v$ as the set of state indices in the
inf-set of the run represented by $q$. This choice can be made by examining the annotations on the edges
from $v$ to the subset of its children containing $q$ in their state labels. Specifically, the child that represents
the most progress is the one that has the smallest annotation on the edge from $v$. This is because a child
with an edge from $v$ annotated with $i$ represents the situation wherein all $\mathcal{A}$-states with indices greater than
$i$ and in the hope set of $v$ have been seen since $v$ was last reset. In the event that an $\mathcal{A}$-state $q$ appears in
the state labels of two siblings with the same annotation on the edges from their parent, we choose to retain
$q$ in the state label of the node that was created earlier, i.e. has a smaller name. As the recursive calls
to algorithm GeneralizedNextRecursive return, Step 4 examines the nodes of the CGS tree in a bottom-up
manner and applies the above criterion to ensure that two nodes not related as ancestor and descendant do
not share any $\mathcal{A}$-state in their state labels in the final tree.
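The tie-breaking rule of Step 4 amounts to letting each shared state be retained by the sibling with the smallest (annotation, name) pair. The following Python sketch illustrates this under assumptions made here for illustration: each child is a plain dictionary with hypothetical keys `name`, `annotation` and `label`, and the removal of states from descendants is omitted.

```python
def resolve_shared_states(children):
    """Step 4 (sketch): among siblings sharing a state q, keep q only in the
    child with the smallest edge annotation, breaking ties by smallest name.

    children -- list of dicts with keys 'name', 'annotation', 'label' (a set)
    Returns a dict: child name -> state label after conflict resolution.
    """
    owner = {}
    # Visit children in order of decreasing priority; the first child seen
    # for a state q becomes the unique sibling that retains q.
    for child in sorted(children, key=lambda c: (c["annotation"], c["name"])):
        for q in child["label"]:
            owner.setdefault(q, child["name"])
    return {
        c["name"]: {q for q in c["label"] if owner[q] == c["name"]}
        for c in children
    }
```

For two siblings named 2 and 7 that both carry $q_5$, with edge annotations 5 and 4 respectively, the sketch leaves node 2 with an empty label and node 7 with $\{q_5\}$.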

Step 5 of algorithm GeneralizedNextRecursive deletes all nodes with empty state labels from the CGS tree
constructed thus far, since the processes represented by these nodes no longer track any runs. In Step 6
we examine the annotations on the edges to all children of the current node $v$. If these annotations are
all 0, we have a situation wherein all runs being tracked by $v$ have seen all states with indices in $v$'s hope
set since the last time $v$ was reset. This constitutes a step of progress in establishing that the hope set of
$v$ is indeed the set of state indices in the inf-set of all runs being tracked by it. Node $v$ is therefore said
to have "succeeded" locally, and is "reset" in Step 6 of algorithm GeneralizedNextRecursive by deleting all its
descendants. If, in addition, $Q_{h(v)} \models \varphi$, then we have a step of progress in establishing that $Q_{h(v)}$ is the
generalized Safra witness set of all runs being tracked by $v$. Step 6 of algorithm GeneralizedNextRecursive
keeps track of this fact by updating the set $U$. As explained earlier, $U$ is eventually used to obtain the value
of component $f$ of the CGS tree $t'$. Finally, Step 7 of algorithm GeneralizedNextRecursive updates component
$e$ of $t'$ by recording the smallest name of a node deleted in the recursive transformation of the CGS tree.

3.3. An Example 

We now illustrate the working of our determinization construction using the non-deterministic Muller
automaton (NMW) $\mathcal{A}$ shown in Figure 1. The Muller acceptance condition of this automaton is given
by $\mathcal{F} = \{\{q_1\}\}$. Let $\mathcal{D}$ be the corresponding deterministic parity automaton obtained by our construction.




Figure 1: Example non-deterministic Muller automaton 






To see how different states and transitions in $\mathcal{D}$ are obtained, we will follow the construction of states
encountered in $\mathcal{D}$ on reading a short prefix of the word $bbbc^\omega$ that is accepted by $\mathcal{A}$. Since $\mathcal{A}$ has 5 states,
we have $n = 5$ and $m = 5^2 + 5 + 1 = 31$. Thus, every node in the CGS tree representing a state of $\mathcal{D}$ has
a name in $[31]$, and a hope set that is a subset of $[5]$. Every edge in the tree is annotated with an element
of $\{0, 1, \ldots, 5\}$. Since the hope set of the root node is always $[5]$, and since the hope set of any other node
$v$ can be obtained by eliminating from $[5]$ the annotations of edges on the path from the root to $v$, we will
simply annotate edges with elements of $[5]$ and not explicitly represent hope sets. Similarly, since the state
label of every node is the union of the state labels of its children, we will simply label leaves of the CGS tree
with subsets of $\mathcal{A}$-states. To help illustrate the intermediate steps of the construction, we will also indicate
the updated values of $e$ and $f$ (components of the CGS tree) in the following discussion.

Figure 2: Steps in determinization construction
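The bookkeeping conventions of this example (the bound $m$ on names, and hope sets recovered from edge annotations along the path from the root) can be checked with a short Python sketch. The function names are hypothetical and are introduced only for illustration.

```python
def name_bound(n):
    """m = n^2 + n + 1, the bound on node names for an automaton with n states."""
    return n * n + n + 1


def hope_set_on_path(n, path_annotations):
    """Hope set of a node: [n] minus the edge annotations on the path from
    the root to the node. An annotation of 0 removes nothing, since 0 is
    not an element of [n] = {1, ..., n}."""
    return set(range(1, n + 1)) - set(path_annotations)
```

For $n = 5$ this gives $m = 31$, and a node reached from the root through edges annotated 3, 5, 4 and 2 has hope set $\{1\}$.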

We start in the initial state consisting of a CGS tree having a single node named 1 and labeled $\{q_1\}$, as
shown in Figure 2-$a_1$. The values of $e$ and $f$ are both $m + 1 = 32$ in this state. On reading the letter $b$,
the state label of the node named 1 (also a leaf in this case) is first changed to $\{q_5\}$, since $q_1$ transitions to
$q_5$ on reading $b$ in automaton $\mathcal{A}$. The CGS tree consisting of only the root node is then extended in Steps
1, 2 and through the recursion in Step 3(a) of algorithm GeneralizedNextRecursive to give the tree shown
in Figure 2-$b_1$. As the recursive calls return in sequence, all nodes other than the ones named 1 and 2 are
deleted. When the recursion returns to the topmost level with the root named 1 as the current node $v$, the
condition in Step 3(b)i of algorithm GeneralizedNextRecursive is satisfied. Consequently, a new node named
7 is created as a child of the root, and assigned the state label $\{q_5\}$. The edge from the root to this child is
annotated with 4, as shown in Figure 2-$b_2$. Subsequently, Step 4 of algorithm GeneralizedNextRecursive
removes $q_5$ from the state label of the leaf named 2 in Figure 2-$b_2$. This is because the annotation of the
edge from the root to this node is larger than that of the edge from the root to its sibling having the same
$\mathcal{A}$-state, $q_5$, in its state label. Removing $q_5$ from its state label causes the leaf named 2 in Figure 2-$b_2$ to
acquire an empty state label; hence this node is deleted in Step 5 of algorithm GeneralizedNextRecursive.
This gives a tree with only two nodes: a root named 1 and a leaf named 7 with state label $\{q_5\}$. The
condition in Step 6 is not satisfied; hence no nodes are "reset" and $U$ continues to be the empty set. In
Step 7, the component $e$ finally acquires the value 2, since that is the smallest name of a node that is
deleted. Once we return from algorithm GeneralizedNextRecursive to algorithm GeneralizedNext, the
name-compaction step assigns the name 2 to the leaf node that was named 7 earlier. Since no node is reset and
the set $U$ is empty, the updated value of the component $f$ remains at 32. The resulting CGS tree obtained
after reading the first $b$ from the input word is shown in Figure 2-$b_3$.

Figure 3: Steps in determinization construction

On reading the next $b$, a sequence of transformations similar to that described above results in a CGS tree
with a root named 1 and a leaf named 2 with state label $\{q_4\}$ and edge annotation 3. Here too, the component
$e$ acquires the value 2 and $U$ remains empty, causing $f$ to have the value 32. Figures 2-$c_1$ to 2-$c_3$ illustrate
the steps in the construction of this CGS tree.

When the third $b$ in the input word is read, the tree in Figure 2-$c_3$ is extended in Steps 1, 2 and
through the recursion in Step 3(a) of algorithm GeneralizedNextRecursive to give the tree shown in Figure
3-$d_1$ sans the nodes named 7 and 8. As the recursive calls to algorithm GeneralizedNextRecursive return in
sequence, Step 3(b)i creates two new leaf nodes (albeit in different recursive calls) named 7 and 8, with
state labels $\{q_1\}$ and $\{q_5\}$ respectively. The edges from the respective parents to the new leaves named 7
and 8 are annotated 0 and 4, respectively. The resulting tree is as shown in Figure 3-$d_1$, except that the
node named 6 no longer has $q_1$ or $q_5$ in its state label. In fact, Step 4 of algorithm GeneralizedNextRecursive
removes both $q_1$ and $q_5$ (once again, in different recursive calls) from the state label of this node, leaving
it with an empty state label. Subsequently, this node is removed in Step 5, giving the intermediate
CGS tree shown in Figure 3-$d_2$. Observe that the node named 5 in this tree has the edge to its sole
child annotated 0. Therefore, this node is "reset" in Step 6 of algorithm GeneralizedNextRecursive and
the child named 7 is deleted. Additionally, since the hope set for the node named 5 in Figure 3-$d_2$ is
$\{1, 2, 3, 4, 5\} \setminus \{3, 5, 4, 2\} = \{1\}$, and since $\{q_1\} \in \mathcal{F}$, we have $P_\varphi(\{q_1\}) = \mathrm{True}$. Therefore, 5 is added to the
set $U$ in Step 6 of algorithm GeneralizedNextRecursive. Since the smallest name of a node that is deleted
is 6, component $e$ finally acquires the value 6 in Step 7. Once we return to algorithm GeneralizedNext, the
name-compaction step renames the leaf node named 8 to 6, as shown in Figure 3-$d_3$. The value of $f$ is
updated to $\min(32, 5) = 5$. The final CGS tree obtained after reading $bbb$ is shown in Figure 3-$d_3$. Figures
3-$p$ and 3-$q$ show the final CGS trees (states) obtained after reading $bbbc$ and $bbbcc$ respectively. For
all subsequent $c$'s that are read from the input word, the CGS tree in Figure 3-$q$ is obtained. Therefore,
the automaton $\mathcal{D}$ loops infinitely in the state represented by Figure 3-$q$ after reading $bbbcc$. Note that
nodes named 5 and 8 are deleted only finitely often but appear as leaves infinitely often in the sequence of
CGS trees (states) visited on reading the word $bbbc^\omega$. Interestingly, the hope sets of the nodes named 5 and
8 in Figure 3-$q$ are precisely the inf-sets of the runs of $\mathcal{A}$ on the word $bbbc^\omega$. As we will see subsequently,
this is not a coincidence, but a consequence of our construction.

Let $T$ be the set of all CGS trees over $\mathcal{A}$. The parity acceptance condition for automaton $\mathcal{D}$ is
$\{F_0, F_1, \ldots, F_{61}\}$, where $F_0 = \{t \in T \mid f = 1, e > 1\}$, $F_{2i+1} = \{t \in T \mid e = i + 2, f \geq e\}$ for $0 \leq i < 30$,
$F_{2i+2} = \{t \in T \mid f = i + 2, e > f\}$ for $0 \leq i < 30$, and $F_{61} = \{t \in T \mid e, f > 31\}$. If we let $\rho$ denote the run
of $\mathcal{D}$ on the word $bbbc^\omega$, then clearly $inf(\rho) \cap F_8 \neq \emptyset$, while $inf(\rho) \cap F_i = \emptyset$ for $0 \leq i < 8$. Therefore, $bbbc^\omega$
is accepted by $\mathcal{D}$.
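The parity sets above determine a unique index for each pair $(e, f)$, which the following Python sketch computes. The generalization from $m = 31$ to an arbitrary bound $m$ (so that the last set is $F_{2m-1}$), the function names, and the treatment of pairs with $e = 1$ and $f \geq e$ (which belong to none of the listed sets, so an error is raised) are assumptions made here for illustration.

```python
def parity_index(e, f, m):
    """Index k of the set F_k containing a CGS tree with components e and f.

    Mirrors the sets in the text for m = 31:
      F_0      : f = 1,     e > 1
      F_{2i+1} : e = i + 2, f >= e
      F_{2i+2} : f = i + 2, e > f
      F_{2m-1} : e, f > m
    """
    if e > m and f > m:
        return 2 * m - 1
    if f < e:
        # f = i + 2 with e > f gives index 2i + 2 = 2f - 2 (f = 1 gives F_0).
        return 2 * f - 2
    if e < 2:
        raise ValueError("pair (e, f) is not covered by the listed sets")
    # e = i + 2 with f >= e gives index 2i + 1 = 2e - 3.
    return 2 * e - 3


def accepts_from_recurring(recurring_pairs, m):
    """Parity acceptance as used in the example: the smallest index among
    the states visited infinitely often must be even."""
    return min(parity_index(e, f, m) for e, f in recurring_pairs) % 2 == 0
```

For the state reached after reading $bbb$ in the example ($e = 6$, $f = 5$, $m = 31$), `parity_index(6, 5, 31)` evaluates to 8, matching the set $F_8$ named above.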



4. Proof of Correctness 

Let $\mathcal{A} = (\Sigma, Q, Q_0, \delta, \varphi)$ be an $\omega$-automaton with acceptance condition based on infinity sets, and let $\mathcal{D}$
be the corresponding DPW obtained by our construction. Let $\alpha \in \Sigma^\omega$ be an $\omega$-word, and let $\rho = t_0 t_1 t_2 \ldots$
be the unique run of $\mathcal{D}$ on $\alpha$. Here, $t_i = (N_i, M_i, r_i, p_i, \lambda_i, h_i, e_i, f_i)$ is the state (tree) of $\mathcal{D}$ reached after
reading the prefix $\alpha(0, i - 1)$ of $\alpha$.

We will first show that if $t_0$ is a CGS tree, as defined earlier, then every $t_i$, for $i \geq 0$, in $\rho$ is also
a CGS tree. From algorithms GeneralizedNext and GeneralizedNextRecursive, it is easy to see that if $t_i$ is a
rooted tree with nodes labeled by subsets of $Q$ and annotated with subsets of $[|Q|]$, then so is $t_{i+1}$, for all
$i \geq 0$. Since $e_{i+1}$ and $f_{i+1}$ are initialized to $m + 1 = |Q|^2 + |Q| + 2$ and subsequently updated to the smaller
of their respective current value and the name of a node in $t_{i+1}$, it follows that $e_{i+1}$ and $f_{i+1}$ are always
in $[|Q|^2 + |Q| + 2]$. Given these observations, it suffices to show the following three additional properties of
$t_{i+1}$ in order to establish that $t_{i+1}$ is indeed a CGS tree.

1. There are no more than $|Q|^2 + |Q| + 1$ nodes in $t_{i+1}$. Since the name-compaction step of algorithm
GeneralizedNext ensures the absence of gaps in the set of names eventually assigned to nodes of $t_{i+1}$,
proving the above property guarantees that the range of the naming function $M_{i+1}$ is indeed $[|Q|^2 +
|Q| + 1]$. We will defer the proof of this property to a later section.

2. The (hope-set) annotation of every node in $t_{i+1}$ is contained in the annotation of its parent, and
differs by at most one element from that of its parent. In addition, every non-leaf node $v$ in $t_{i+1}$ has
at least one child with an annotation that is a strict subset of $h_{i+1}(v)$. The first property is proved
in Lemma 2 below. The second property is a consequence of Lemma 2 and Step 6 of algorithm
GeneralizedNextRecursive.

3. The state label of every node in $t_{i+1}$ is the union of state labels of its children in $t_{i+1}$. In addition, the
state labels of sibling nodes in $t_{i+1}$ are mutually disjoint. We will prove the first property in Lemma
3 below. The second property is a consequence of Step 4 of algorithm GeneralizedNextRecursive and
the fact that no step of algorithm GeneralizedNextRecursive adds any element to an already existing
state label of a node.

Lemma 2. For every $i \geq 0$ and for every node $u$ and its child $v$ in $t_i$, $h_i(v) \subseteq h_i(u)$ and $|h_i(u) \setminus h_i(v)| \leq 1$.
Proof: We will prove the lemma by induction on the indices of $t_0, t_1, \ldots$.

Base Case: For the tree $t_0$ with only the root node $r_0$, the claim in the lemma holds vacuously since there
are no nodes with children in $t_0$.

Hypothesis: We assume that the claim in the lemma holds for $t_i$, where $i \geq 0$.

Induction: Consider the tree $t_{i+1}$ obtained by applying algorithm GeneralizedNext to $t_i$. From the pseudocode
of algorithms GeneralizedNext and GeneralizedNextRecursive, we observe that the hope set of a node in $t_{i+1}$
can be updated only in Step 2 or Step 3(b)i of algorithm GeneralizedNextRecursive. In both these steps,
the node whose hope set is updated is a newly created node that is added as a child of the current node.
Now let $v$ be an arbitrary node in $t_{i+1}$. We consider two cases below.

• Suppose $v \in N_i \cap N_{i+1}$. Thus, $v$ was present in $t_i$ and was not deleted in the process of transforming
$t_i$ to $t_{i+1}$. Since deletion of a node (Step 5 or Step 6 of algorithm GeneralizedNextRecursive) entails
deletion of all descendants of the node as well, the fact that $v$ was not deleted implies that no ancestor
of $v$ was deleted either in the process of transforming $t_i$ to $t_{i+1}$. Thus, both $v$ and its parent, say $u$, in
$t_{i+1}$ were present in $t_i$, and neither of them was newly created in Step 2 or Step 3(b)i of algorithm
GeneralizedNextRecursive during the transformation of $t_i$ to $t_{i+1}$. Hence, $h_{i+1}(v) = h_i(v)$ and $h_{i+1}(u) =
h_i(u)$. By the induction hypothesis, we already know that $h_i(v) \subseteq h_i(u)$ and $|h_i(u) \setminus h_i(v)| \leq 1$. The
inductive claim now follows immediately.

• Suppose $v$ was newly created in the process of transforming $t_i$ to $t_{i+1}$. Since new nodes can be created
only in Step 2 or Step 3(b)i of the recursive algorithm GeneralizedNextRecursive, $v$ must have been
created in one of these steps. From the pseudocode of algorithm GeneralizedNextRecursive, it is easy
to see that both these steps set $h_{i+1}(v)$ to $h_{i+1}(u) \setminus \{k\}$, where $u$ is the parent of $v$ in $t_{i+1}$ and
$k \in \{0\} \cup h_{i+1}(u)$. It follows that $h_{i+1}(v) \subseteq h_{i+1}(u)$ and $|h_{i+1}(u) \setminus h_{i+1}(v)| \leq 1$.

Therefore, by the principle of mathematical induction, the claim in the lemma holds for all $t_i$, where
$i \geq 0$. □
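The invariant of Lemma 2 is easy to test mechanically on a concrete tree. The Python sketch below assumes a hypothetical representation in which `parent` maps each non-root node to its parent and `hope` maps each node to its hope set; it is meant only as an illustration of the statement, not as part of the construction.

```python
def satisfies_lemma2(parent, hope):
    """Check that every child's hope set is contained in its parent's hope
    set and that the parent's hope set has at most one extra element.

    parent -- dict: child node -> parent node (the root has no entry)
    hope   -- dict: node -> hope set (a Python set of indices)
    """
    return all(
        hope[v] <= hope[u] and len(hope[u] - hope[v]) <= 1
        for v, u in parent.items()
    )
```

For example, a chain of nodes with hope sets $\{1,2,3,4,5\}$, $\{1,2,4,5\}$, $\{1,2,4\}$, $\{1,2\}$ and $\{1\}$ satisfies the invariant, whereas replacing the third hope set by $\{1\}$ violates it.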



Lemma 3. For every $i \geq 0$ and for every non-leaf node $v$ in $t_i$, $\lambda_i(v) = \bigcup_{v' \in V} \lambda_i(v')$, where $V = \{v' \mid
v' \in N_i, v = p_i(v')\}$.



Proof: We will prove the lemma by induction on the indices of $t_0, t_1, \ldots$.

Base Case: For the tree $t_0$ with only the root node $r_0$, the claim in the lemma holds vacuously since there
are no non-leaf nodes in $t_0$.

Hypothesis: We assume that the claim in the lemma holds for $t_i$, where $i \geq 0$.

Induction: Consider the tree $t_{i+1}$ obtained by applying algorithm GeneralizedNext to $t_i$. Since the claim in the
lemma holds for $t_i$ (by induction hypothesis), and since the initialization step of algorithm GeneralizedNext
replaces the state label of every node $v$ with $\delta(\lambda_i(v), \sigma)$, where $\sigma$ is the input letter being read, it follows that the state label of every non-leaf
node continues to be the union of state labels of its children even after the initialization step. Since no
nodes are added or deleted, and the state labels of no nodes are changed in Steps 3 and 4 of algorithm
GeneralizedNext (i.e., during name-compaction and update of component $f$), the inductive claim can be
proved by establishing that Step 2 of algorithm GeneralizedNext does not violate the claim. This amounts to
showing that algorithm GeneralizedNextRecursive preserves the property that the state label of every node is
the union of state labels of its children. We therefore focus on the steps of algorithm GeneralizedNextRecursive
below.

Clearly, Step 1 of algorithm GeneralizedNextRecursive preserves the desired property. Although Step 2
results in the creation of a new child $v'$ of $v$, the desired property is preserved, since the state label of
$v'$ is set to that of $v$. In Step 3(b)i, new children may be created for $v$, but the union of state labels of
children of $v$ remains unchanged. This is because for every new child $v'$ that is created, Step 3(b)i sets the
state label of $v'$ to $\{q\}$, where $q$ is in the state label of an already existing child of $v$. Step 3(b)ii presents
a more interesting situation. Let $v_k$ be a child of $v$ such that the annotation on the edge from $v$ to $v_k$ is
$j_k$. From Lemma 2 and from the definition of edge annotations, we know that $h_{i+1}(v) = h_{i+1}(v_k) \cup \{j_k\}$.
If a state $q$ in the state label of $v_k$ is such that $q \neq q_{j_k}$ and $q \notin Q_{h_{i+1}(v_k)}$, Step 3(b)ii of algorithm
GeneralizedNextRecursive removes $q$ from the state label of $v_k$ and from the state labels of all its descendants.
This can give rise to a situation wherein $q$ is in the state label of $v$ (parent of $v_k$) but not in the state
label of any child of $v$, potentially violating the property that the state label of every node is the union of
state labels of its children. However, such a violation is only temporary and is rectified by the time the
recursion of algorithm GeneralizedNextRecursive terminates. To see why this is so, notice that since $q \neq q_{j_k}$
and $q \notin Q_{h_{i+1}(v_k)}$, we must have $q \notin Q_{h_{i+1}(v)} = Q_{h_{i+1}(v_k)} \cup \{q_{j_k}\}$. Hence, when the recursion of algorithm
GeneralizedNextRecursive returns to the level where the current node is the parent $u$ of node $v$ in $t_{i+1}$, we
have two possibilities.

1. Suppose $q = q_r$, where $r$ is the annotation of the edge from $u$ to $v$ in $t_{i+1}$. In this case, Step 3(b)i of
algorithm GeneralizedNextRecursive creates a new child $v'$ of $u$ with state label $\{q\}$, and with an edge
annotation that is smaller than $r$. This eventually causes $q$ to be removed from the state label of $v$ in
Step 4 of algorithm GeneralizedNextRecursive.

2. Suppose $q \neq q_r$, where $r$ is the annotation of the edge from $u$ to $v$ in $t_{i+1}$. Since $q \notin Q_{h_{i+1}(v)}$ as well,
Step 3(b)ii of algorithm GeneralizedNextRecursive removes $q$ from the state label of $v$.

Therefore, if $q$ is removed from the state label of a child $v_k$ of $v$ by Step 3(b)ii of algorithm GeneralizedNextRecursive,
then it is also eventually removed from the state label of $v$. This ensures that the desired property of the
state label of a node being the union of state labels of its children is eventually preserved. Step 4 of
algorithm GeneralizedNextRecursive can remove a state $q$ from the state label of a node, but only if $q$ is
also present in the state label of a sibling node. Hence, Step 4 cannot change the union of state labels of
children of a node. Step 5 deletes nodes with an already empty state label, while Steps 6 and 7 do not
modify the state label of any node. Step 6 can cause a non-leaf node to turn into a leaf node, but this
does not affect the desired property, which relates only to non-leaf nodes.

Thus, if algorithm GeneralizedNextRecursive is invoked on a tree in which the state label of every node is
the union of state labels of its children, the algorithm preserves this property after it has transformed the
tree recursively. This, coupled with the inductive hypothesis, implies that $t_{i+1}$ satisfies the inductive claim.

Therefore, by the principle of mathematical induction, the claim in the lemma holds for all $t_i$, where
$i \geq 0$. □
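The label properties discussed above (every non-leaf label is the union of its children's labels, and sibling labels are mutually disjoint) can likewise be checked on a concrete tree. The sketch below assumes a hypothetical representation with `children` mapping each node to a list of its children and `label` mapping each node to its state label.

```python
def labels_consistent(children, label):
    """Check, for every non-leaf node, that its state label equals the union
    of its children's labels and that the children's labels are disjoint.

    children -- dict: node -> list of child nodes (empty list for a leaf)
    label    -- dict: node -> state label (a Python set of states)
    """
    for v, kids in children.items():
        if not kids:
            continue  # the property only concerns non-leaf nodes
        union = set().union(*(label[k] for k in kids))
        if union != label[v]:
            return False
        # Disjointness: the sizes of the children's labels must add up
        # exactly to the size of their union.
        if sum(len(label[k]) for k in kids) != len(union):
            return False
    return True
```

A root labeled $\{q_1, q_5\}$ with leaves labeled $\{q_1\}$ and $\{q_5\}$ passes the check, while letting both leaves carry $q_5$ fails it.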

CGS trees encountered along a run of $\mathcal{D}$ have several interesting properties that are useful in proving
the correctness of our construction. We will prove these properties below by considering an arbitrary run
$\rho = t_0 t_1 t_2 \ldots$ of $\mathcal{D}$ and by inductively showing that the respective properties hold for every CGS tree $t_i$ along
$\rho$.

Proposition 4. For every $i \geq 0$, for every $v \in N_i$ and for every $q \in \lambda_i(v)$, there is a run of the automaton
$\mathcal{A}$ from some $q_0 \in Q_0$ to $q$ on the prefix $\alpha(0, i - 1)$.

Proof: We will prove this by induction on the indices of $t_0, t_1, \ldots$.

Base Case: For the tree $t_0$ with only the root node $r_0$, the claim in the proposition holds trivially, since
$\lambda_0(r_0) = Q_0$ by definition.

Hypothesis: We assume that the claim in the proposition holds for $t_i$, where $i \geq 0$.

Induction: Consider the tree $t_{i+1}$ obtained by applying algorithm GeneralizedNext to $t_i$. We know from
the initialization step (Step 1) of algorithm GeneralizedNext that the state label of $r_{i+1}$ is initially set to
$\delta(\lambda_i(r_i), \alpha_i)$. We also know from the pseudocode of algorithm GeneralizedNextRecursive that invoking this
algorithm on a CGS tree rooted at a node $v$ does not change the state label of $v$. Since Step 2 of algorithm
GeneralizedNext invokes algorithm GeneralizedNextRecursive on the CGS tree rooted at $r_{i+1}$, the state label
of $r_{i+1}$ remains unchanged at $\delta(\lambda_i(r_i), \alpha_i)$ after the call to GeneralizedNextRecursive returns. Subsequently,
neither Step 3 nor Step 4 of algorithm GeneralizedNext changes the state label of any node in $t_{i+1}$.
Therefore, $\lambda_{i+1}(r_{i+1}) = \delta(\lambda_i(r_i), \alpha_i)$. Now let $v$ be an arbitrary node in $t_{i+1}$ and let $q \in \lambda_{i+1}(v)$. By
Lemma 3, we know that $q \in \lambda_{i+1}(r_{i+1}) = \delta(\lambda_i(r_i), \alpha_i)$. By the inductive hypothesis, for every $q' \in \lambda_i(r_i)$,
there is a run of $\mathcal{A}$ from some $q_0 \in Q_0$ to $q'$ on the prefix $\alpha(0, i - 1)$. Therefore, there is a run of $\mathcal{A}$ from
some $q_0 \in Q_0$ to $q \in \delta(\lambda_i(r_i), \alpha_i)$ on the prefix $\alpha(0, i)$.

By the principle of mathematical induction, the claim in the proposition holds for all $i \geq 0$. □

Lemma 5. For every $i \geq 0$ and for every $v \in N_i$ such that $v$ is a non-leaf node of $t_i$, we have $h_i(v) \neq \emptyset$.

Proof: From Lemma 2 and Step 6 of algorithm GeneralizedNextRecursive, it follows that if $v$ is a non-leaf
node of $t_i$, it must have a child $v'$ such that $h_i(v')$ is a strict subset of $h_i(v)$. This immediately implies that
$h_i(v) \neq \emptyset$. □

Lemma 6. Let m = jQp -I- IQj -I- 1. For every i > 0, if fi < m + 1, there exists a leaf node v in ti with 
name Mi{v) — fi such that hi{v) ^ 0. 

Proof: We will prove the lemma by induction on the indices of to,ti .... 

Base Case: For the CGS tree to with only the root node tq, the claim in the lemma holds vacuously since 
fo =m + 1. 

Hypothesis : We assume that the claim in the lemma holds for ti, where i > 0. 

Induction: Consider the CGS tree $t_{i+1}$ obtained by applying algorithm GeneralizedNext to $t_i$. The value
of $f_{i+1}$ is set in Step ^ of algorithm GeneralizedNext to the smaller of $m + 1$ and the smallest name of a
node added to the set $U$ in Step © of algorithm GeneralizedNextRecursive. Therefore, if $f_{i+1} < m + 1$, a
node $v$ with $M_{i+1}(v) = f_{i+1}$ must have been added to the set $U$ in Step (5) of a recursive call of algorithm
GeneralizedNextRecursive. Furthermore, $v$ must have been the root node of the CGS sub-tree transformed by
this specific recursive call. The condition in Step ^ of algorithm GeneralizedNextRecursive requires that all
children of $v$ must have their hope set equal to $h_{i+1}(v)$ (or alternatively, the annotations on all edges from
$v$ to its children must be 0). Therefore, $v$ must have been a non-leaf node prior to being "reset" in Step ^
of algorithm GeneralizedNextRecursive. We now consider two cases below depending on whether the node $v$
was present in $t_i$ or not, and show that $h_{i+1}(v) \neq \emptyset$ in both cases.

• Suppose $v \in N_i \cap N_{i+1}$. By the argument given in the proof of Lemma ([5]), we know that $h_i(v) =
h_{i+1}(v)$. If $v$ was a non-leaf node in $t_i$, by Lemma 5, $h_i(v) \neq \emptyset$. Hence, $h_{i+1}(v) \neq \emptyset$ as well. If $v$
was a leaf node in $t_i$, we could either have $h_i(v) = \emptyset$ or $h_i(v) \neq \emptyset$. In the latter case, we easily get
$h_{i+1}(v) = h_i(v) \neq \emptyset$. In the former case, we note that $v$ cannot become a non-leaf node prior to Step
(O of algorithm GeneralizedNextRecursive in the process of transforming $t_i$ to $t_{i+1}$. This is because Step
((TJ of algorithm GeneralizedNextRecursive prevents any children from being added to $v$ if $h_i(v) = \emptyset$.
Therefore, $h_i(v)$ must have been non-empty in $t_i$, and the claim in the lemma follows.

• If $v$ is newly created in the process of transforming $t_i$ to $t_{i+1}$, then by the argument used in an
earlier proof, $v$ must have been created either in Step ^ or in Step (3(b)i) of algorithm
GeneralizedNextRecursive. If $v$ was created as a leaf node in Step (3(b)i), it could not have become
a non-leaf node prior to execution of Step ©. This is because algorithm GeneralizedNextRecursive
is not called recursively on any leaf node created in Step (3(b)i). If $v$ was created as a leaf node in
Step ^, the only way it could have become a non-leaf node prior to execution of Step © is by a
recursive invocation of algorithm GeneralizedNextRecursive on this node in Step (3). However, Step ((!))
of algorithm GeneralizedNextRecursive ensures that such a recursive invocation adds a child to $v$ only
if the hope set of $v$ is non-empty. Therefore, we must have $h_{i+1}(v) \neq \emptyset$.

Since node $v$ is "reset" and all descendants of $v$ are deleted in Step © of algorithm GeneralizedNextRecursive,
$v$ becomes a leaf node at the end of Step (5). Furthermore, since $t_i$ and $t_{i+1}$ are trees, every node has a
unique parent in $t_i$ and $t_{i+1}$, and hence, algorithm GeneralizedNextRecursive is recursively invoked at most
once on a node in Step (3). It follows that after node $v$ is "reset" and turned into a leaf by a recursive call of
algorithm GeneralizedNextRecursive, there are no subsequent recursive calls to GeneralizedNext with $v$ as the
root of a CGS subtree to be transformed. From the pseudocode of algorithm GeneralizedNext, we note that
this implies that no child gets added to $v$ after it is "reset". Therefore, $v$ either remains as a leaf node in $t_{i+1}$
or is subsequently deleted in the process of transforming $t_i$ to $t_{i+1}$. However, since $f_{i+1}$ is set to $M_{i+1}(v)$,
we know from Step ^ of algorithm GeneralizedNext that $v$ is present in $N_{i+1}$. Therefore, $v$ is a leaf node in
$t_{i+1}$ with $M_{i+1}(v) = f_{i+1}$ and $h_{i+1}(v) \neq \emptyset$.

By the principle of mathematical induction, the claim in the lemma holds for all $t_i$, where $i \geq 0$. □

Lemma 7. Let $\alpha$ be an $\omega$-word and let $\rho = t_0 t_1 \ldots$ be the unique run of $\mathcal{D}$ on $\alpha$. Let $i, k$ be indices and let
$v$ be a node such that: (i) $i < k$, (ii) for all $z \in \{i, i+1, \ldots, k\}$, node $v$ is present in $t_z$ and $h_z(v) \neq \emptyset$, and
(iii) node $v$ is a leaf in both $t_i$ and $t_k$, and is a non-leaf node in all $t_z$, where $i < z < k$. Then the following
claims hold.

1. Node $v$ is "reset" in the process of transforming $t_{k-1}$ to $t_k$.

2. For every $q' \in \lambda_k(v)$, there is a $q \in \lambda_i(v)$ such that there is a run $\psi$ of $\mathcal{A}$ on $\alpha(i, k-1)$ with $\psi(0) = q$,
$\psi(k-i) = q'$ and $\psi(z-i) \in \lambda_z(v)$ for all $z \in \{i, i+1, \ldots, k\}$.

3. For every run $\psi$ of $\mathcal{A}$ on the word segment $\alpha(i, k-1)$ such that $\psi(z-i) \in \lambda_z(v)$ for all $z \in \{i, i+1, \ldots, k\}$,
all states in $Q_{h_i(v)}$ are visited in $\psi$.

Proof: 

1. We will prove this claim by contradiction. If possible, suppose $v$ becomes a leaf node in $t_k$ without
being "reset" in the process of transforming $t_{k-1}$ to $t_k$. Consider the case when $k = i + 1$. Since $v$ is a
leaf in $t_i$ and $h_i(v) \neq \emptyset$, Step Q of algorithm GeneralizedNextRecursive creates at least one child of $v$
with the same non-empty state label as that of $v$ when GeneralizedNextRecursive is invoked with $v$ as
the root of the CGS subtree to be transformed. If $k > i + 1$, then since $v$ is a non-leaf node in $t_{k-1}$,
there is at least one child of $v$ with a non-empty state label in $t_{k-1}$. By Lemma 3, the state label
of $v$ in this case is also the union of state labels of its children in $t_{k-1}$. Thus, in either case, there is
an intermediate step during the transformation of $t_{k-1}$ to $t_k$ when $v$ has one or more children with
non-empty state labels, and the union of state labels of its children equals the state label of $v$. All
these children must eventually be deleted before $v$ becomes a leaf node in $t_k$.

From the pseudocode of algorithm GeneralizedNextRecursive, we note that the only steps that delete
nodes from a CGS tree are Step (5) and Step ©. Since $v$ exists in $t_k$ and is assumed not to have been
"reset" in the process of transforming $t_{k-1}$ to $t_k$, its children could not have been deleted in Step ©.
Therefore, all its children must have been deleted in Step ^ of algorithm GeneralizedNextRecursive.
This requires all children of $v$ to acquire the empty state label. We know from above that there exist
one or more children of $v$ with non-empty state labels in an intermediate step during the transformation
of $t_{k-1}$ to $t_k$. The state labels of all such children must therefore be emptied before they can be deleted
in Step (5). From the pseudocode of algorithm GeneralizedNextRecursive, the only steps that remove
states from the state labels of nodes are Step (3(b)ii) and Step ^. However, Step ^ simply
removes duplicates from the state labels of siblings, and cannot render the state labels of all children
of $v$ empty. Therefore, Step (3(b)ii) must eventually be responsible for emptying the state labels of all
children of $v$. However, we know from the proof of Lemma 3 that if a state is removed from the state
label of a child of $v$ in Step (3(b)ii) of algorithm GeneralizedNextRecursive, then that state is eventually
removed from the state label of $v$ as well. Since the state label of $v$ equals the union of state labels
of all its children at an intermediate step in the transformation of $t_{k-1}$ to $t_k$, the above implies that
all states in the state label of $v$ must eventually be removed in the process of transforming $t_{k-1}$ to $t_k$.
This, in turn, implies that $v$ is removed from $t_k$ in Step (5) of algorithm GeneralizedNextRecursive - a
contradiction!

2. Since node $v$ is present in all $t_z$, for $z \in \{i, i+1, \ldots, k\}$, it follows from Step ^ that $\lambda_r(v)$ is always
initialized to $\delta(\lambda_{r-1}(v), \alpha_{r-1})$, for $r \in \{i+1, \ldots, k\}$. Since no other step of algorithm GeneralizedNext
or algorithm GeneralizedNextRecursive adds states to the state label of an already existing node, the
claim now follows from an easy induction on $z \in \{i+1, \ldots, k\}$.

3. From the pseudocodes of algorithms GeneralizedNext and GeneralizedNextRecursive, we note that since
node $v$ exists in $t_z$ for all $z \in \{i, i+1, \ldots, k\}$, the hope set of $v$ must stay unchanged, i.e., $h_i(v) = h_z(v)$
for all $z \in \{i, i+1, \ldots, k\}$. Now let $r$ be an arbitrary index such that $i \leq r < k$. Suppose node
$v$ has a child $v'$ in a (possibly intermediate) step of algorithm GeneralizedNextRecursive during the
transformation of $t_r$ to $t_{r+1}$. Suppose further that the edge from $v$ to $v'$ is annotated with $j$ and the
state label of $v'$ is $S$ in this step. We will first prove the following claim.

Claim 1: For every run $\psi$ of $\mathcal{A}$ on $\alpha(i, r)$ such that $\psi(z-i) \in \lambda_z(v)$ for all $z \in \{i, \ldots, r-1\}$ and
$\psi(r-i) \in S$, all states in $\{q_n, q_{n-1}, \ldots, q_{j+1}\} \cap Q_{h_i(v)}$ are visited in $\psi$.
The proof is by induction on $r$.

Base Case: We know that $v$ is a leaf node in $t_i$ with $h_i(v) \neq \emptyset$. Therefore, during the transformation
of $t_i$ to $t_{i+1}$, Step (2) of algorithm GeneralizedNextRecursive creates a child $v'$ of $v$ and adds all states
in $\delta(\lambda_i(v), \alpha_i)$ to the state label of $v'$. In addition, the edge from $v$ to $v'$ is annotated with $j =
\max(h_i(v)) > 0$. This implies that $\{q_n, q_{n-1}, \ldots, q_{j+1}\} \cap Q_{h_i(v)} = \emptyset$. Hence, the claim follows vacuously.
Suppose additional children of $v$ are subsequently created in Step (3(b)i) of algorithm GeneralizedNextRecursive.
Since $v$ is a leaf in $t_i$, it can be seen from the pseudocode of algorithm GeneralizedNextRecursive
that prior to execution of Step (3(b)i), $v$ could have had only a single child - the one created
in Step (2), with the edge from $v$ to this child annotated with $j = \max(h_i(v))$. In order for a
new child of $v$, say $v''$, to be created in Step (3(b)i), we note from the pseudocode of algorithm
GeneralizedNextRecursive that the state label of $v''$ must be $\{q_j\}$ and the annotation of the edge from
$v$ to $v''$ must be $l = \max(\{0\} \cup (h_i(v) \cap \{1, 2, \ldots, j-1\}))$. Since $j = \max(h_i(v))$, it follows that
$\{q_n, q_{n-1}, \ldots, q_{l+1}\} \cap Q_{h_i(v)} = \{q_j\}$. Since the state label of $v''$ is also $\{q_j\}$, the claim is easily seen
to hold for $v''$. Since no other step of algorithm GeneralizedNextRecursive or algorithm GeneralizedNext
adds any state to the state label of $v''$, this proves the base case of the induction.

Hypothesis: We assume that the claim is true for $r$, where $i \leq r < k - 1$.

Induction: Consider the transformation of $t_{r+1}$ to $t_{r+2}$. Let $v'$ be a child of $v$ in some step of algorithm
GeneralizedNextRecursive during this transformation. Suppose further that the edge from $v$ to $v'$ is
annotated with $j$ and the state label of $v'$ is $S$ in this step. We consider two cases below.

• If $v'$ is present in $t_{r+1}$, then by the argument used in the proof of Lemma 2, $v$ must also
have been present in $t_{r+1}$ with $h_{r+1}(v) = h_{r+2}(v)$ and $h_{r+1}(v') = h_{r+2}(v')$. Therefore, by the
definition of edge annotations, the edge from $v$ to $v'$ must have been annotated with $j$ in $t_{r+1}$
as well. Step (1) of algorithm GeneralizedNext ensures that the state label of $v'$ is initialized
to $\delta(\lambda_{r+1}(v'), \alpha_{r+1})$ during the transformation of $t_{r+1}$ to $t_{r+2}$. This, along with the inductive
hypothesis, and the facts that $h_{r+1}(v) = h_{r+2}(v)$ and the edge annotations from $v$ to $v'$ are the
same in $t_{r+1}$ and in $t_{r+2}$, imply that the claim holds for $v'$ after the initialization step during
the transformation of $t_{r+1}$ to $t_{r+2}$. Since no other step of algorithm GeneralizedNextRecursive or
algorithm GeneralizedNext adds any state to the state label of $v'$, this proves the inductive claim
for $v'$.

• If $v'$ is not present in $t_{r+1}$, it must have been created as a child of $v$ in Step (2) or in Step
(3(b)i) of algorithm GeneralizedNextRecursive during the transformation of $t_{r+1}$ to $t_{r+2}$. Since
$i < r + 1 < k$ (by the condition in our inductive hypothesis), we know that $v$ is a non-leaf node in
$t_{r+1}$. Therefore, $v'$ could not have been created in Step (2) of algorithm GeneralizedNextRecursive
(this step requires $v$ to be a leaf node in $t_{r+1}$). Hence, $v'$ must have been created in Step
(3(b)i).

From the pseudocode of algorithm GeneralizedNextRecursive, we note that when $v'$ is created as a
child of $v$ in Step (3(b)i), the state label of $v'$ is set to $\{q_{j_x}\}$, where $j_x$ is the annotation of the edge
from $v$ to an already existing child $v_x$, and $q_{j_x}$ is in the state label of $v_x$ at the time of creation
of $v'$. In addition, the annotation of the new edge from $v$ to $v'$ is set to $l = \max(\{0\} \cup (h_{r+2}(v) \cap
\{1, 2, \ldots, j_x - 1\}))$. Since $v$ is a non-leaf node in $t_{r+1}$, the child $v_x$ itself could not have been
created in Step (2) of algorithm GeneralizedNextRecursive during the transformation of $t_{r+1}$ to
$t_{r+2}$. It could not have been created in Step (3(b)i) of algorithm GeneralizedNextRecursive either,
since Step (3(b)) of algorithm GeneralizedNextRecursive iterates over the children of $v$ existing prior
to the execution of this step. Therefore, the child $v_x$ of $v$ must be present in $t_{r+1}$.
Since $v$ and $v_x$ are present in both $t_{r+1}$ and in the intermediate CGS tree at the time of creation
of $v'$, the hope sets of $v$ and $v_x$, and the annotation of the edge from $v$ to $v_x$ must be the
same in $t_{r+1}$ and in the intermediate CGS tree. This implies that the edge from $v$ to $v_x$ is
annotated with $j_x$ in $t_{r+1}$. By virtue of Step (1) of algorithm GeneralizedNext, we also know
that there is a state $q' \in \lambda_{r+1}(v_x)$ such that $q_{j_x} \in \delta(q', \alpha_{r+1})$. This, along with the inductive
hypothesis, and the facts that $h_{r+1}(v) = h_{r+2}(v)$ and the annotation of the new edge from $v$ to $v'$
is $l = \max(\{0\} \cup (h_{r+2}(v) \cap \{1, 2, \ldots, j_x - 1\}))$, imply that for every run $\psi$ of $\mathcal{A}$ on $\alpha(i, r+1)$ such that
$\psi(z-i) \in \lambda_z(v)$ for $z \in \{i, \ldots, r\}$ and $\psi(r+1-i) = q_{j_x}$, all states in $\{q_n, q_{n-1}, \ldots, q_{l+1}\} \cap Q_{h_{r+2}(v)}$
are visited.

From the pseudocode of algorithm GeneralizedNextRecursive, no step other than Step (3(b)i) adds
any state to the state label of $v'$ after it is created in Step (3(b)i). Therefore, $v'$ has at most one
state, $q_{j_x}$, in its state label in any intermediate step of algorithm GeneralizedNextRecursive during
the transformation of $t_{r+1}$ to $t_{r+2}$. We have already considered the case of $q_{j_x}$ in the state label
of $v'$ above. Hence, this proves the inductive claim for $v'$ and also completes the proof of Claim 1.

To complete the proof of Lemma 7(3), we note from Lemma 7(1) that $v$ is "reset" during
the transformation of $t_{k-1}$ to $t_k$. Therefore, from Step ^ of algorithm GeneralizedNextRecursive, $v$
must have had at least one child with non-empty state label prior to being "reset". In addition, the
annotations of all edges from $v$ to its children with non-empty state labels must have been 0 prior to
the resetting of $v$. It then follows from Claim 1 that for every run $\psi$ of $\mathcal{A}$ such that $\psi(z-i) \in \lambda_z(v)$
for all $z \in \{i, \ldots, k-1\}$ and $\psi(k-i)$ is in the state label of some child of $v$ prior to it being reset, all
states in $\{q_n, \ldots, q_1\} \cap Q_{h_i(v)} = Q_{h_i(v)}$ are visited in $\psi$.

This does not prove Lemma 7(3) yet, since we must show the above result for $\psi(k-i) \in \lambda_k(v)$.

We have seen earlier, in the proof of Lemma 3, that the state label of a node $v$ may temporarily
contain states that are not in the state labels of any of its children after intermediate steps of algorithm
GeneralizedNextRecursive. However, we also saw in the same proof that all such states are eventually
removed from the state label of $v$ after all recursive invocations of GeneralizedNextRecursive have
returned. Therefore, proving the claim of Lemma 7(3) for $\psi(k-i)$ in the state labels of children of $v$
prior to $v$ being "reset" proves Lemma 7(3) itself.

□
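
Claim 2 of Lemma 7 rests on the fact that each state label is contained in the $\delta$-image of the previous one, so a run that stays inside the labels can be recovered by walking backwards from the target state. The sketch below illustrates that argument only; the function name `run_through_labels`, the dictionary encoding of $\delta$, and the list-based representation of the labels $\lambda_i(v), \ldots, \lambda_k(v)$ are assumptions made for the example.

```python
from typing import Dict, List, Optional, Sequence, Set, Tuple

# Assumed encoding: (state, letter) -> set of successor states.
Delta = Dict[Tuple[str, str], Set[str]]


def run_through_labels(
    delta: Delta,
    labels: Sequence[Set[str]],
    word: str,
    target: str,
) -> Optional[List[str]]:
    """Build a run ending in `target` whose z-th state lies in labels[z].

    word[z] is the letter read between labels[z] and labels[z + 1].
    Assumes labels[z + 1] is a subset of delta(labels[z], word[z]); under that
    invariant the greedy backward choice of a predecessor always succeeds.
    """
    if target not in labels[-1]:
        return None
    run = [target]
    for z in range(len(labels) - 2, -1, -1):
        # Pick any predecessor inside labels[z] that can reach the current head.
        pred = next(
            (q for q in sorted(labels[z]) if run[0] in delta.get((q, word[z]), set())),
            None,
        )
        if pred is None:
            return None  # the assumed invariant does not hold for this input
        run.insert(0, pred)
    return run
```

The greedy choice is sound only because of the containment invariant: every state kept in a label has at least one predecessor in the preceding label, so the backward walk never gets stuck.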



Lemma 8. Let $\alpha$ be an $\omega$-word and let $\rho = t_0 t_1 \ldots$ be the unique run of $\mathcal{D}$ on $\alpha$. For every $i \geq 0$ and for
every node $v$ in $t_i$, $\lambda_i(v) \subseteq Q_{h_i(v)}$.



Proof: We will prove this claim by contradiction. Suppose there exists an $i \geq 0$ and a node $v$ in $t_i$ such that
$q_l \in \lambda_i(v)$ although $l \notin h_i(v)$. Clearly, $v$ cannot be the root, $r_i$, of $t_i$, since $h_i(r_i)$ $(= h_0(r_0) = [n])$ contains
the indices of all states of $\mathcal{A}$. Therefore, $v$ must have a parent, say $u$, in $t_i$. Recalling that $t_0$ has only a
single node (i.e., $r_0$) without any parent, we can immediately infer that $i > 0$. In other words, there exists
a CGS tree $t_{i-1}$ such that $t_i$ is obtained by applying algorithm GeneralizedNext to $t_{i-1}$.

From the pseudocode of algorithm GeneralizedNextRecursive, we observe that during the transformation
of $t_{i-1}$ to $t_i$, the only nodes in $t_i$ on which the recursive algorithm GeneralizedNextRecursive is not recursively
invoked are those that are generated in Step (3(b)i). Furthermore, every node generated in Step (3(b)i) is
either deleted or survives as a leaf in the transformation of $t_i$ to $t_{i+1}$. Since node $u$ is a non-leaf node in
$t_i$, algorithm GeneralizedNextRecursive must have been invoked with $u$ as the root of the CGS subtree to be
transformed, during the transformation of $t_{i-1}$ to $t_i$.



Let $j$ be the annotation of the edge from $u$ to $v$ in $t_i$. There are two possibilities that we consider
separately below.



• Suppose $v$ is created during the transformation of $t_{i-1}$ to $t_i$. This can happen either in Step (2) or
in Step (3(b)i) of the recursive invocation of algorithm GeneralizedNextRecursive with $u$ as the root of
the CGS subtree to be transformed.

– If $v$ is created in Step (3(b)i), it follows from the pseudocode of algorithm GeneralizedNextRecursive
that $\lambda_i(v) = \{q_l\}$, where $l$ $(> 0)$ is the annotation of an edge from $u$ to an already existing child,
say $v''$, of $u$. In addition, $h_i(v)$ is set to $h_i(u) \setminus \{\max((h_i(u) \cup \{0\}) \cap \{0, 1, 2, \ldots, l-1\})\}$. By
the definition of edge annotations, $l \in h_i(u) \setminus h_i(v'')$ and hence $l \in h_i(u)$. It then follows that
$l \in h_i(u) \setminus \{\max((h_i(u) \cup \{0\}) \cap \{0, 1, 2, \ldots, l-1\})\} = h_i(v)$ as well. Therefore, $\lambda_i(v) \subseteq Q_{h_i(v)}$
when $v$ is created. Since no other step of algorithm GeneralizedNextRecursive adds any state to $\lambda_i(v)$ subsequently,
we have $\lambda_i(v) \subseteq Q_{h_i(v)}$. This gives us a contradiction!

– If $v$ is created in Step (2), then Step (3(b)i) must subsequently be executed in the same recursive
invocation of GeneralizedNextRecursive with $u$ as the root of the CGS subtree to be transformed.
This is similar to the case considered below wherein $v$ exists in $t_{i-1}$, and Step (3(b)i) is executed
in the recursive invocation of GeneralizedNextRecursive with $u$ as the root of the CGS subtree to
be transformed.

• Suppose $v$ exists in $t_{i-1}$. It follows that the parent, $u$, of $v$ must also exist in $t_{i-1}$. Consider Step
(3(b)i) in the recursive invocation of algorithm GeneralizedNextRecursive with $u$ as the root of the
CGS subtree, during the transformation of $t_{i-1}$ to $t_i$. We have two sub-cases to consider.

– If $j = l$, a new child, say $v'$, of $u$ is created in Step (3(b)i), the state label of $v'$ is set to
$\{q_l\}$ and the edge from $u$ to $v'$ is annotated with an index $< l$. This implies that in Step ^ of
algorithm GeneralizedNextRecursive, $q_l$ is removed from the state label of $v$. Since no other step of
algorithm GeneralizedNextRecursive adds states to $\lambda_i(v)$ subsequently, it follows that $q_l \notin \lambda_i(v)$.
This gives a contradiction!

– Suppose $j \neq l$. Since $l$ is also not in $h_i(v)$, it follows that in Step (3(b)ii) of the recursive
invocation of GeneralizedNextRecursive with $u$ as the root of the CGS subtree to be transformed,
$q_l$ is removed from $\lambda_i(v)$. By the same argument used above, $q_l$ cannot be subsequently added to
$\lambda_i(v)$. Hence, $q_l \notin \lambda_i(v)$ - a contradiction again!

We have therefore shown that there is no $i \geq 0$ and no node $v$ in $t_i$ such that $q_l \in \lambda_i(v)$ and $l \notin h_i(v)$. □

Armed with the above properties of CGS trees encountered along a run of $\mathcal{D}$, we will now show that the
languages accepted by $\mathcal{D}$ and $\mathcal{A}$ are the same. As before, let $\alpha$ be an $\omega$-word in $L(\mathcal{D})$ and let $\rho = t_0 t_1 \ldots$
be the unique run of $\mathcal{D}$ on $\alpha$. By definition of the acceptance condition for $\mathcal{D}$, there exists an even index
$2a + 2$, where $0 \leq 2a + 2 \leq 2m - 1$, such that CGS trees from the parity acceptance set $F_{2a+2}$ are seen
infinitely often along $\rho$, while CGS trees from all parity acceptance sets $F_y$, where $0 \leq y < 2a + 2$, are seen
only finitely often along $\rho$. Let $i^*$ be the smallest index $(> 0)$ such that all CGS trees $t_i$ for $i > i^*$ are
outside $\bigcup_{0 \leq y < 2a+2} F_y$. The following lemma describes important properties of the suffix $t_i, t_{i+1}, \ldots$ of $\rho$,
where $i > i^*$.

Lemma 9. Let $i$ and $i'$ be indices such that (i) $0 \leq i^* < i < i'$, (ii) both $t_i$ and $t_{i'}$ are in $F_{2a+2}$, and (iii)
$t_z \notin F_{2a+2}$ for all $z \in \{i+1, \ldots, i'-1\}$. Then there exists a node $v$ such that the following hold.

1. $v$ is present in $t_z$ for all $z \in \{i, i+1, \ldots, i'\}$. In addition, $M_z(v) = a + 2$ and $h_z(v) = h_i(v) \neq \emptyset$ for all
$z \in \{i, i+1, \ldots, i'\}$.

2. $v$ is a non-leaf node in $t_z$, for all $z \in \{i+1, \ldots, i'-1\}$.

3. For every state $q' \in \lambda_{i'}(v)$, there is some state $q \in \lambda_i(v)$ such that there is a run of $\mathcal{A}$ from $q$ to $q'$ on
$\alpha(i, i'-1)$ that visits all and only states in $Q_{h_i(v)}$.



Proof:

1. Since both $t_i$ and $t_{i'}$ are in $F_{2a+2}$, it follows from the definition of even-indexed parity acceptance sets
that $f_i = f_{i'} = a + 2$. Also, since $0 \leq 2a + 2 \leq 2m - 1$, we have $1 \leq a + 2 \leq m$. Therefore, by Lemma
6, both $t_i$ and $t_{i'}$ contain a leaf node with name $a + 2$ and with a non-empty hope set.
Since $i^* < i < i'$, it follows from the definition of $i^*$ that for all $z \in \{i, i+1, \ldots, i'\}$, the CGS tree $t_z$
is not in $\bigcup_{0 \leq x < 2a+2} F_x$. Recalling the definitions of $F_x$ for odd and even indices $x$, we see that this
implies $e_z > a + 2$ for all $z \in \{i, i+1, \ldots, i'\}$. Hence no node with name $\leq a + 2$ is removed in the
process of transforming $t_i$ to $t_{i+1}$, $t_{i+1}$ to $t_{i+2}$, and so on until $t_{i'}$ is obtained. Therefore, the node
$v$ with name $a + 2$ in $t_i$ continues to be a part of all $t_z$, where $i \leq z \leq i'$. Since $e_z > a + 2$, the
name-compaction step of algorithm GeneralizedNext keeps the name of node $v$, i.e., $a + 2$, unchanged
in all of $t_z$. Hence, node $v$ is present in $t_z$ and $M_z(v) = a + 2$, for all $z \in \{i, i+1, \ldots, i'\}$. Furthermore,
since $h_i(v) \neq \emptyset$ and since $v$ is not deleted in the sequence of transformations from $t_i$ to $t_{i'}$, it follows
that $h_z(v) = h_i(v) \neq \emptyset$, for $i \leq z \leq i'$.

2. Consider an index $z$ such that $i < z < i'$. If $v$ was a non-leaf node in $t_{z-1}$, then it starts off as a non-leaf
node with at least one child having a non-empty state label when algorithm GeneralizedNextRecursive is
invoked on $t_{z-1}$ to transform it to $t_z$. If $v$ was a leaf node in $t_{z-1}$ (as is the case when $z = i + 1$, for
example), then since $h_{z-1}(v) \neq \emptyset$ (by Lemma 9(1) above), Step ^ of algorithm GeneralizedNextRecursive
ensures that $v$ becomes a non-leaf node with at least one child having a non-empty state label in an
intermediate step during the transformation of $t_{z-1}$ to $t_z$. Thus, in either case, $v$ becomes a non-leaf
node with at least one child having a non-empty state label in some intermediate step of algorithm
GeneralizedNextRecursive.

In order for $v$ to subsequently become a leaf node in $t_z$, all its children must be deleted. Deletion of
nodes can only happen in Step (5) or Step (6) of algorithm GeneralizedNextRecursive. We show that
neither of these steps can delete all children of $v$ in $t_z$.

• Since $v$ stays back in $t_z$ (by Lemma 9(1) above), if the leaves of $v$ are deleted in Step ^ of
algorithm GeneralizedNextRecursive, $v$ must be "reset" and $M_z(v) = a + 2$ must be added to $U$
(since $P_\varphi(Q_{h_z(v)}) = P_\varphi(Q_{h_i(v)}) = \mathrm{True}$) in that step. Therefore, $f_z$ must be set to a value no larger
than $a + 2$ in Step @ of algorithm GeneralizedNext. Since $e_z > a + 2$ (as shown in the proof of
Lemma 9(1)), this would imply that $t_z \in F_x$, where $0 \leq x \leq 2a + 2$. Recalling the definition of
$i^*$, this contradicts the fact that $z > i > i^*$.

• If all leaves of $v$ are deleted in Step ^ of algorithm GeneralizedNextRecursive, then the union
of state labels of the children of $v$ must be empty at some intermediate step of algorithm
GeneralizedNextRecursive. We have seen above in the proof of Lemma 3 that the state label
of a node is eventually no larger than the union of state labels of its children at any intermediate
step of algorithm GeneralizedNextRecursive. Therefore, if all leaves of $v$ are deleted in Step ^
of algorithm GeneralizedNextRecursive, the state label of $v$ must eventually become empty in $t_z$.
However, $v$ must then be deleted from $t_z$ by Step (5) of algorithm GeneralizedNextRecursive. This
contradicts Lemma 9(1) proved above.

Therefore, $v$ must be a non-leaf node in $t_z$.
3. Lemma ^3) is an immediate consequence of Lemmas ^l), ^2), ([Till]), (17][31) and ([5]). 

□ 
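
The parity condition used throughout Lemma 9 (some even-indexed set $F_{2a+2}$ is seen infinitely often while every lower-indexed set is seen only finitely often) can be checked mechanically on an ultimately periodic run, where only the repeating part matters. The sketch below is an illustration under assumptions: the run is given as a finite cycle, each position of the cycle carries the set of indices $x$ with $t_z \in F_x$, and the name `parity_accepts` is hypothetical.

```python
from typing import Iterable, Set


def parity_accepts(cycle_indices: Iterable[Set[int]]) -> bool:
    """Decide the parity condition on the repeating part of a lasso-shaped run.

    cycle_indices[z] is the set of indices x such that the z-th tree of the
    cycle belongs to F_x. Exactly the indices occurring in the cycle are seen
    infinitely often, so the run is accepting iff the smallest of them is even.
    """
    recurring: Set[int] = set()
    for indices in cycle_indices:
        recurring |= indices
    return bool(recurring) and min(recurring) % 2 == 0
```

For instance, a cycle that visits $F_4$ and $F_5$ is accepting with $2a + 2 = 4$, whereas a cycle that also visits $F_3$ is not.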

Lemma 10. $L(\mathcal{D}) \subseteq L(\mathcal{A})$.

Proof: We will prove this lemma by constructing a finitely branching infinite tree K along the lines of 
Safra's proof of correctness of his NSW determinization construction, and by showing the existence of an 
infinite accepting path of A in this tree. 

The vertices of $K$ are elements of $\{r\} \cup (Q \times \mathbb{N})$, where $r$ is a special vertex representing the root of $K$.
For every $q_0 \in Q_0$, we draw an edge from $r$ to $(q_0, 0)$. As defined earlier, let $i^*$ be the minimum index after
which no CGS tree from $F_x$, for $x < 2a + 2$, is visited in the sequence $t_0, t_1, \ldots$. Let $i_1$ be the smallest index
greater than $i^*$ such that $f_{i_1} = a + 2$, and let $v$ be the node in $t_{i_1}$ identified in Lemma 9(1). From Lemma
9(1), we know that $M_i(v) = a + 2$ and $h_i(v) = h_{i_1}(v) \neq \emptyset$ for all $i > i_1$. For every state $q$ in $\lambda_{i_1}(v)$ we add
a vertex $(q, i_1)$ to the tree $K$. For every such state $q$, Proposition 4 tells us that there is a state $q_0 \in Q_0$
such that there is a run of $\mathcal{A}$ from $q_0$ to $q$ on $\alpha(0, i_1 - 1)$. We add an edge from $(q_0, 0)$ to $(q, i_1)$ in tree $K$ for
every such $q_0 \in Q_0$ and $q \in \lambda_{i_1}(v)$. Subsequently, we extend the tree $K$ inductively as follows. Given a tree
with a leaf $(q_k, i_z)$, where $q_k \in \lambda_{i_z}(v)$ and $i_z > i_1$ is such that $f_{i_z} = a + 2$, we find the smallest $i_{z+1} > i_z$ such
that $f_{i_{z+1}} = a + 2$. Since CGS trees in $F_{2a+2}$ are encountered infinitely often in $t_0, t_1, \ldots$ (by the acceptance
condition of $\mathcal{D}$), such an $i_{z+1}$ always exists. For every state $q' \in \lambda_{i_{z+1}}(v)$, we now add a vertex $(q', i_{z+1})$
to the tree $K$. By Lemma 9(3), there is a state $q$ in $\lambda_{i_z}(v)$ such that there is a run of $\mathcal{A}$ from $q$ to $q'$ on
$\alpha(i_z, i_{z+1} - 1)$ that visits all and only states in $Q_{h_{i_1}(v)}$. For every such $q' \in \lambda_{i_{z+1}}(v)$ and $q \in \lambda_{i_z}(v)$, we add
an edge from $(q, i_z)$ to $(q', i_{z+1})$ to extend the tree $K$. It is easy to see that $K$ is an infinite tree with the
branching of each node $(q, i_z)$ bounded by the cardinality of $\lambda_{i_{z+1}}(v)$, which is at most $|Q|$. Therefore, it follows from
König's lemma that there is an infinite path in $K$.

From Proposition 4, every edge $((q_0, 0), (q', i_1))$ corresponds to a run of $\mathcal{A}$ on $\alpha(0, i_1 - 1)$ that starts
at $q_0$ and ends at $q'$. From Lemma 9(3), every edge $((q, i_z), (q', i_{z+1}))$ for $z \geq 1$ corresponds to a run of $\mathcal{A}$
on $\alpha(i_z, i_{z+1} - 1)$ that starts at $q$ and ends at $q'$ and visits all and only states in $Q_{h_{i_1}(v)}$. Therefore, the
infinite path in $K$ identified above corresponds to a run $\rho$ of $\mathcal{A}$ that starts from some $q_0 \in Q_0$ and eventually
visits all and only states in $Q_{h_{i_1}(v)}$. In other words, $\mathit{inf}(\rho) = Q_{h_{i_1}(v)}$. Furthermore, since $f_{i_1} = a + 2$ and
$M_{i_1}(v) = a + 2$, we must have $P_\varphi(Q_{h_{i_1}(v)}) = \mathrm{True}$. In other words, $\mathit{inf}(\rho) \models \varphi$, and hence $\rho$ is an accepting
run of $\mathcal{A}$. This implies $\alpha \in L(\mathcal{A})$. □
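
The final step of the proof evaluates the acceptance predicate on $\mathit{inf}(\rho)$, the set of states visited infinitely often. For an ultimately periodic run this set is just the set of states on the repeating part. The sketch below illustrates the check for one concrete choice of acceptance condition, a Streett condition; the pair convention (if a state of the first set recurs, some state of the second set must recur), the names `inf_of_lasso` and `streett_holds`, and the lasso representation are assumptions made for the example and are not the paper's definition of $P_\varphi$.

```python
from typing import FrozenSet, Iterable, Sequence, Set, Tuple


def inf_of_lasso(prefix: Sequence[str], cycle: Sequence[str]) -> FrozenSet[str]:
    """States visited infinitely often by the run prefix . cycle^omega."""
    # The finite prefix contributes nothing; only the cycle repeats forever.
    return frozenset(cycle)


def streett_holds(inf: Set[str], pairs: Iterable[Tuple[Set[str], Set[str]]]) -> bool:
    """Assumed Streett convention: for each pair (request, response),
    if some request state recurs then some response state must recur."""
    return all(not (inf & request) or bool(inf & response) for request, response in pairs)
```

With this convention, a run cycling through `q1` and `q2` satisfies the pair `({"q1"}, {"q2"})`, while a run cycling through `q1` alone does not.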

Lemma 11. $L(\mathcal{A}) \subseteq L(\mathcal{D})$.

Proof: Consider an $\omega$-word $\alpha \in L(\mathcal{A})$. Let $\psi = q_{k_0} q_{k_1} q_{k_2} \ldots$ be an accepting run of $\mathcal{A}$ on $\alpha$, and let
$\rho = t_0, t_1, t_2 \ldots$ be the unique run of $\mathcal{D}$ on $\alpha$, where $t_i$ is the CGS tree $(N_i, M_i, r_i, p_i, l_i, h_i, f_i, e_i)$. Consider
the transformation of $t_i$ to $t_{i+1}$ by algorithm GeneralizedNext. Step (1) of algorithm GeneralizedNext
updates the state label of $r_i$ to $\delta(\lambda_i(r_i), \alpha_i)$. Subsequently, no step of algorithm GeneralizedNext or
algorithm GeneralizedNextRecursive deletes any state from the state label of $r_i$, deletes $r_i$, or adds $r_i$ as the
child of any other node. It therefore follows from an easy inductive argument that the root $r_i$ of $t_i$
eventually survives as the root $r_{i+1}$ of $t_{i+1}$ for all $i \geq 0$. Since $M_0(r_0) = 1$ and $h_0(r_0) = [n]$, and since every
node in $t_i$ that is not deleted in transforming $t_i$ to $t_{i+1}$ retains its name and hope set in $t_{i+1}$, we have
$M_{i+1}(r_{i+1}) = 1$, $h_{i+1}(r_{i+1}) = [n]$ and $e_{i+1} > 1$ for all $i \geq 0$. Also, by definition, $e_0 > 1$. Therefore,
$e_i > 1$ for all $i \geq 0$.

Let $J$ be the set of indices of all states in $\mathit{inf}(\psi)$, i.e., $J = \{j \mid q_j \in \mathit{inf}(\psi)\}$. Let $i_1$ be the smallest
index such that for all $i > i_1$, we have $q_{k_i} \in \mathit{inf}(\psi)$. We wish to identify those nodes $v$ in $t_z$ that have
$\psi(z) \in \lambda_z(v)$, for all $z > i_1 + 1$. In other words, we wish to identify nodes in the sequence of CGS trees
$t_{i_1+1}, t_{i_1+2}, \ldots$ that track the run $\psi$ of $\mathcal{A}$ from position $i_1 + 1$ onwards.

We have already seen above that $r_0$ survives as the root node in all CGS trees in $\rho$. We also know that
$\lambda_0(r_0) = Q_0$, by definition. Since Step (1) of algorithm GeneralizedNext updates $\lambda_{i+1}(r_{i+1})$ to $\delta(\lambda_i(r_i), \alpha_i)$
for all $i \geq 0$, and since no subsequent step during the transformation of $t_i$ to $t_{i+1}$ deletes any state from the
state label of the root $r_{i+1}$, it follows from an easy inductive argument that $\psi(z) \in \lambda_z(r_z)$, for all $z \geq 0$.

Now suppose the root node becomes a leaf infinitely often in $\rho(i_1 + 1, \infty)$. Let $j$ and $j'$ be arbitrary
indices such that $i_1 + 1 < j < j'$, and the root node is a leaf in $t_j$ and $t_{j'}$, but not in any $t_z$, for $j < z < j'$.
Since we also know that $h_i(r_i) = [n] \neq \emptyset$ for all $i \geq 0$, it follows from Lemma 7(3) and Lemma 8 that the
set of states visited in $\psi(j, j')$ is exactly $Q_{h_i(r_i)} = Q_{[n]}$. By repeating the same argument for all successive
pairs of indices $j, j'$ such that $i_1 + 1 < j < j'$, and the root node is a leaf in $t_j$ and $t_{j'}$, but not in any $t_z$
in between, we get $\mathit{inf}(\psi) = Q_{h_i(r_i)}$, for every $i > i_1$. Since $\psi$ is an accepting run of $\mathcal{A}$, we also know that
$P_\varphi(\mathit{inf}(\psi)) = \mathrm{True}$. This implies that $P_\varphi(Q_{h_i(r_i)}) = \mathrm{True}$ for all those indices $i > i_1$ where $r_i$ becomes a leaf
node in $\rho(i_1 + 1, \infty)$. By Lemma 7(1), we know that $r_i$ is "reset" in these steps as well. Hence the name of $r_i$ is added to
the set $U$ in Step ^ of algorithm GeneralizedNextRecursive during the transformation of $t_{i-1}$ to $t_i$ for each
such $i$. Since the root has the smallest name ($M_i(r_i) = 1$), the component $f_i$ of the CGS tree $t_i$ is set to 1
infinitely often, while $e_i > 1$. Hence the set $F_0$ is visited infinitely often and $\alpha \in L(\mathcal{D})$.

If the root node becomes a leaf finitely often, there is an index $i_2 > i_1$ such that the root node is a non-leaf
node in all $t_z$ for $z > i_2$. By Lemma 3, we know that for all $z > i_2$, every state in $\lambda_z(r_z)$ is also in $\lambda_z(v)$ for
some child $v$ of $r_z$. Since $\psi(z) \in \lambda_z(r_z)$ for all $z \geq 0$, it follows that for all $z > i_2$, there is a child $v$ of $r_z$ such
that $\psi(z) \in \lambda_z(v)$. Now consider the transformation of $t_z$ to $t_{z+1}$ for $z > i_2$, and let $v_z$ be the node in $t_z$ such
that $\psi(z) \in \lambda_z(v_z)$. Step (1) of algorithm GeneralizedNext initializes the state label of $v_z$ with $\delta(\lambda_z(v_z), \alpha_z)$,
thereby placing $\psi(z+1)$ in the state label of $v_z$. Subsequently, if $\psi(z+1)$ is moved out of the state label of $v_z$,
either Step (3(b)ii) or Step ^ of algorithm GeneralizedNextRecursive must be responsible for this. However,
if $\psi(z+1)$ is removed from the state label of $v_z$ in Step (3(b)ii), from the argument used in the proof of
Lemma 3, we know that $\psi(z+1)$ must eventually be removed from the state label of the parent of $v_z$ in
$t_{z+1}$, i.e. from the state label of $r_{z+1}$. This is a contradiction, since $\psi(z) \in \lambda_z(r_z)$ for all $z \geq 0$. Therefore,
if $\psi(z+1)$ is removed from the state label of $v_z$, Step @ of algorithm GeneralizedNextRecursive must be
responsible for the removal. From the pseudocode of GeneralizedNextRecursive, we now observe that if $v_{z+1}$
is the new node containing $\psi(z+1)$ in its state label in $t_{z+1}$, then either $M_{z+1}(v_{z+1}) < M_{z+1}(v_z) = M_z(v_z)$
or the annotation of the edge from $r_{z+1}$ to $v_{z+1}$ in $t_{z+1}$ is less than the annotation of the edge from $r_{z+1}$
to $v_z$ in $t_{z+1}$. Since both $r_z$ $(= r_{z+1})$ and $v_z$ existed in $t_z$, the annotation of the edge from $r_{z+1}$ to $v_z$ in $t_{z+1}$
must be the same as the annotation of the edge from $r_z$ to $v_z$ in $t_z$. Therefore, if the child of the root that
tracks $\psi$ changes from $t_z$ to $t_{z+1}$, then either the name of the node reduces or the annotation of the edge
from the root to this node reduces during the transformation from $t_z$ to $t_{z+1}$. Since neither the name nor
the annotation can decrease infinitely often, there must be an index $i_3 > i_2$ such that for all $z > i_3$, the child of
the root that contains $\psi(z)$ in its state label has the same name and the same annotation of the edge from
the root to this child. In other words, if $v_z$ and $v_{z+1}$ are children of the root in $t_z$ and $t_{z+1}$ respectively such
that $\psi(z) \in \lambda_z(v_z)$ and $\psi(z+1) \in \lambda_{z+1}(v_{z+1})$, then $M_z(v_z) = M_{z+1}(v_{z+1})$ and $h_z(v_z) = h_{z+1}(v_{z+1})$.

If possible, let $v_z$ and $v_{z+1}$ be distinct nodes. As seen above, Step (U) of algorithm GeneralizedNextRecursive
is responsible for moving $\psi(z+1)$ from the state label of $v_z$ to that of $v_{z+1}$ during the transformation of
$t_z$ to $t_{z+1}$. From the pseudocode of algorithm GeneralizedNextRecursive, we note that either the annotation
of the edge from the root to $v_{z+1}$ must be less than the annotation of the edge from the root to $v_z$, or the
name of $v_{z+1}$ must be less than the name of $v_z$ at the time of execution of this step. Since the name of
$v_{z+1}$ can only reduce further during the name-compaction step and since the annotation of the edge from
the root to $v_{z+1}$ cannot change subsequently in any step of algorithm GeneralizedNextRecursive or algorithm
GeneralizedNext, we cannot have both the names and the annotations of the edges from the root identical
for $v_z$ in $t_z$ and for $v_{z+1}$ in $t_{z+1}$. Since $z > i_3$, this gives us a contradiction! Therefore, $v_z$ is the same node
as $v_{z+1}$ for all $z > i_3$. Since $M_z(v_z)$ also stays unchanged for all $z > i_3$, no node with name $\leq M_z(v_z)$ is
deleted during the transformation of $t_z$ to $t_{z+1}$, for $z > i_3$. This implies that $e_z > M_z(v_z)$ for all $z > i_3$.

We now claim that $h_z(v_z) \neq \emptyset$ for all $z > i_3$. To see why this is so, suppose $h_z(v_z) = \emptyset$ for some
$z > i_3$, and let $j$ be the annotation of the edge from $r_z$ to $v_z$ in $t_z$. Consider Step 3(b) of the recursive
invocation of algorithm GeneralizedNextRecursive with the parent of $v_z$, i.e. $r_z$, as the root of the CGS subtree
to be transformed. Let $q_l$ be a state in the state label of $v_z$ when Step 3(b) is executed. If $l = j$, then
Step 3(b)i creates a new sibling $v'$ of $v_z$, sets the state label of $v'$ to $\{q_l\}$ and sets the annotation of the
edge from $r_z$ to $v'$ to an index $< l$. Since no further step removes the state label of the newly created leaf $v'$,
state $q_l$ gets removed from the state label of $v_z$ in Step (?) of algorithm GeneralizedNextRecursive. If, on the
other hand, $l \neq j$, then since $h_z(v_z)$ is assumed to be $\emptyset$, Step 3(b)ii removes $q_l$ from the state label of
$v_z$. Thus, in either case, no state eventually remains in the state label of $v_z$ in $t_z$ if $h_z(v_z) = \emptyset$. This
implies that $v_z$ is deleted from $t_z$ in Step (?), a contradiction. Therefore, we must have $h_z(v_z) \neq \emptyset$ for
all $z > i_3$.

We now consider the case where the node $v_z$ becomes a leaf infinitely often in $\rho(i_3 + 1, \infty)$. By using the
same argument as used above when the root becomes a leaf infinitely often, we find that for every $z > i_3$ such
that $v_z$ is a leaf in $t_z$, the node $v_z$ is added to the set $U$ in Step (?) of algorithm GeneralizedNextRecursive
during the transformation of $t_{z-1}$ to $t_z$. Therefore, $f_z < M_z(v_z)$ for all $z > i_3$. We have also seen above
that $e_z > M_z(v_z)$ for all $z > i_3$. This implies that a parity acceptance set with an even index $x$ is visited
infinitely often by the run $\rho$ of $\mathcal{D}$. Hence $w \in L(\mathcal{D})$.

If $v_z$ becomes a leaf only finitely often in $\rho(i_3 + 1, \infty)$, we can repeat the same argument as used above
and show that there is an index $i_4 > i_3$ and a child $v'$ of $v_z$ such that (i) $v'$ is present in $t_i$,
(ii) $\psi(i) \in \lambda_i(v')$, (iii) $h_i(v') = h_{i+1}(v') \neq \emptyset$, and (iv) $M_i(v') = M_{i+1}(v')$, for all $i > i_4$.
Since all CGS trees $t_i$ have height $< n$ (as argued in Section (?)), by continuing the above argument, we find
that there must exist an even index $x$ such that the parity acceptance set with index $x$ is visited infinitely
often by $\rho$. In other words, $w \in L(\mathcal{D})$. □

Theorem 12. $L(\mathcal{D}) = L(\mathcal{A})$

Proof: Follows from Lemmas 10 and 11. □



5. Complexity 

Theorem 13. Given an automaton $\mathcal{A}$ with $n$ states, the deterministic parity automaton $\mathcal{D}$ constructed
above has at most $n^{O(n^2)}$ states and $O(n^2)$ parity acceptance sets.

Proof: The computation for the number of states of the automaton $\mathcal{D}$ is similar to that done by Piterman
for his NSW to DPW construction. Since every state of $\mathcal{D}$ is a CGS tree over $\mathcal{A}$, we will count the total
number of CGS trees over $\mathcal{A}$ below, assuming $n = |Q|$ and $m = n^2 + n + 1$.

The salient steps in counting the number of CGS trees over $\mathcal{A}$ are as follows.

• Since the state labels of leaves in a CGS tree are pairwise disjoint, and since every leaf has a nonempty
state label, there can be at most $n$ leaves.

• If we collapse the vertices at the head and tail of every 0-annotated edge in a CGS tree, we get
a tree with no 0-annotated edges. Since the hope set of the root is always $[n]$ and since the hope set
of a child in the collapsed tree misses exactly one index from the hope set of its parent, the height of
the collapsed tree can be at most $n$. This, along with the fact that there are at most $n$ leaves, implies
that there are at most $n^2 + 1$ nodes in the collapsed tree.

• To count the nodes that were removed due to the collapsing operation described above, we note that
each node in the original CGS tree must have a path (possibly of zero length) to a leaf such that
each edge along this path has a non-0 annotation. Hence, if $u$ and $v$ are nodes such that the edge
from the parent of $u$ to $u$ and that from the parent of $v$ to $v$ are both annotated with 0, the path of
non-0 annotated edges from $u$ to a leaf cannot overlap with the corresponding path from $v$ to a leaf.
Therefore, there can be at most $n$ nodes in a CGS tree such that the edges from the respective parents
to these nodes are annotated with 0. This implies that the total number of nodes in a CGS tree can
be at most $m = n^2 + n + 1$.

• By construction, the parent of a node always has a smaller name than the node. Thus the parenthood
relation can be represented by a sequence of at most $m - 1$ names where the $i$-th name is a value in
$\{1, \ldots, i - 1\}$. For a tree with $k$ nodes, there are at most $(k - 2)!$ such sequences of length $k - 1$.
Considering all trees with number of nodes in $\{1, \ldots, m\}$, there are at most $\sum_{k=2}^{m} (k - 2)!$, which is
at most $(m - 1)!$, such sequences. Hence, there are at most as many named trees where children have
larger names than their respective parents.

• The state label of a node is given by the union of state labels of leaves in the sub-tree rooted at that
node. In addition, the labels of leaves are pairwise disjoint. Therefore, the state labels of all nodes in a
tree can be obtained by associating each $\mathcal{A}$-state with the leaf that contains it in its state label. Since
leaves in a tree may not be named with the first few contiguous names, we sort the leaves by names
and then use a mapping from $\mathcal{A}$-states to positions of leaves in this name-sorted order. If an $\mathcal{A}$-state
does not appear in any leaf, we associate the position 0 with it. Thus, the number of state labelings of
a named tree is at most the number of mappings $Q \to \{0, 1, \ldots, n\}$, i.e. at most $(n + 1)^n$.

• The (hope set) annotation of a node is represented using edge annotations as follows. Suppose the
hope set of a node $v$ is $h(v)$ and that of its child $v'$ is $h(v')$. Then the edge from $v$ to $v'$ is annotated
with $h(v) \setminus h(v')$ if $h(v') \subset h(v)$, and with 0 if $h(v') = h(v)$. By properties of CGS trees,
$h(v') \subseteq h(v)$ and $|h(v) \setminus h(v')| \le 1$. Therefore, the edge annotation is a unique element in
$[n] \cup \{0\}$. Similarly, the hope set for every node is uniquely determined if the annotations of all edges
are given. Specifically, the hope set of a node is simply $[n]$ minus the annotations on edges along the path
from the root to this node. Therefore, it is sufficient to count the number of edge annotation functions to
obtain the count of hope set annotations of nodes. Each edge can be identified by the name of the node it
points to. The total number of edge annotation functions is then easily seen to be the number of functions
$[m] \to [n] \cup \{0\}$. This is bounded above by $(n + 1)^m$.

• For the acceptance condition, we need to know the value of $e$ when $e < f$, and the value of $f$ when
$f < e$. Thus we need to keep track of at most $2m$ values.
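The correspondence between edge annotations and hope sets described in the list above can be illustrated with a small sketch. The function name `hope_sets` and the dictionary encoding of a tree (`parent`, `annotation`) are assumptions made for this illustration only; they are not part of the construction's pseudocode.

```python
def hope_sets(n, parent, annotation, root=1):
    """Recover the hope set of every node from the edge annotations.

    parent[v]     : name of the parent of node v (the root has no entry)
    annotation[v] : annotation of the edge parent[v] -> v, a value in {0, 1, ..., n}
    (Both encodings are hypothetical, chosen for this sketch.)
    """
    hopes = {root: frozenset(range(1, n + 1))}  # the root's hope set is [n]
    # A parent always has a smaller name than its children, so visiting nodes
    # in increasing name order sees every parent before its children.
    for v in sorted(parent):
        removed = annotation[v]
        parent_hope = hopes[parent[v]]
        # Annotation 0 means "same hope set as the parent"; otherwise the
        # annotated index is dropped from the parent's hope set.
        hopes[v] = parent_hope if removed == 0 else parent_hope - {removed}
    return hopes


# Example with n = 3: root 1 has children 2 and 3, node 2 has child 4.
example = hope_sets(3, parent={2: 1, 3: 1, 4: 2}, annotation={2: 3, 3: 0, 4: 1})
print(sorted(example[4]))
```

Because the hope sets are a function of the edge annotations alone, counting annotation functions is enough to count hope set annotations, which is the point used in the bound.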

Combining the above counts, the total number of CGS trees over $\mathcal{A}$ is at most

$$(m - 1)! \cdot (n + 1)^{n+m} \cdot (2m) = n^{O(n^2)}.$$

The number of parity acceptance sets is $2m = 2 \cdot (n^2 + n + 1) = O(n^2)$. □
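As a numerical sanity check of the product above, the following sketch evaluates $(m-1)! \cdot (n+1)^{n+m} \cdot 2m$ for small $n$ and compares its logarithm with $n^2 \log_2 n$. The function names are hypothetical, and the printed ratio only suggests the growth rate; it does not prove the asymptotic statement.

```python
import math


def cgs_tree_bound(n):
    """Upper bound (m-1)! * (n+1)^(n+m) * 2m on the number of CGS trees, with m = n^2 + n + 1."""
    m = n * n + n + 1
    return math.factorial(m - 1) * (n + 1) ** (n + m) * (2 * m)


def parity_sets(n):
    """Number of parity acceptance sets, 2m = 2(n^2 + n + 1)."""
    return 2 * (n * n + n + 1)


for n in range(2, 7):
    bits = math.log2(cgs_tree_bound(n))        # log2 of the state bound
    ratio = bits / (n * n * math.log2(n))      # compare against n^2 log2 n
    print(n, parity_sets(n), round(bits, 1), round(ratio, 2))
```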



6. An improved upper bound for $\omega$-automata

The determinization construction proposed above gives a DPW starting from a variety of different
non-deterministic automata, all of which have an acceptance condition based on infinity sets. By Theorem 13,
the number of states of the DPW is at most $2^{O(n^2 \log n)}$, while the number of sets in the parity
acceptance condition is at most $O(n^2)$, where $n$ is the number of states of the original automaton $\mathcal{A}$. This
bound also holds when the input automaton is a pairs automaton, viz. a Streett or a Rabin automaton. This
is significant since the size of the output DPW, both in terms of number of states and acceptance pairs, is
independent of the number of pairs of the input pairs automaton. This is different from the case of Safra's
determinization construction for NSW, where the output DRW/DPW has at most $2^{O(nh \log(nh))}$
states and $O(nh)$ pairs, where $n$ and $h$ are the count of states and pairs, respectively, of the input NSW.

This naturally leads us to ask if $2^{O(n^2 \log n)}$ is a better bound than $2^{O(nh \log(nh))}$ for determinization of
NSW/NRW. The answer to this question is not immediately obvious and requires us to show that there
are indeed examples of NSW/NRW with $O(n)$ states and $h$ pairs for which Safra's and Piterman's NSW
determinization constructions will end up constructing automata with state count worse than
$2^{O(n^2 \log n)}$. In the following, we present a class of such automata. In the case when $h > n^k$, where
$k > 1$, this immediately implies an improved worst case complexity bound on NSW/NRW determinization.
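To make the comparison of the two upper bounds concrete, the sketch below evaluates the exponent shapes $n^2 \log_2 n$ and $nh \log_2(nh)$ with all hidden constants dropped. The function names are hypothetical, and since the constants inside the $O(\cdot)$ are ignored, the numbers only indicate how the two exponents scale with the number of pairs $h$.

```python
import math


def ours_exponent(n):
    """Shape of the exponent in 2^{O(n^2 log n)}, constants dropped."""
    return n * n * math.log2(n)


def safra_exponent(n, h):
    """Shape of the exponent in 2^{O(nh log(nh))}, constants dropped."""
    return n * h * math.log2(n * h)


for n in (4, 8, 16):
    for label, h in (("h = n", n), ("h = n^2", n * n), ("h = 2^n", 2 ** n)):
        print(n, label, round(ours_exponent(n)), round(safra_exponent(n, h)))
```

For $h = n$ the two exponents differ only by a constant factor, while for $h = n^2$ or $h = 2^n$ the pair-dependent exponent grows strictly faster, which is why the question is only interesting for automata with many pairs.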

Theorem 14. There exists a family $\mathbf{A}_S$ of NSW where each NSW $\mathcal{A}_S \in \mathbf{A}_S$ has $3n + 1$ states and
$2^n + 1$ accepting pairs for which the Safra-Schwoon (Piterman) construction constructs a DRW (DPW) with
$2^{\Omega(n^3)}$ states, while our construction (algorithm GeneralizedNext) constructs a DRW/DPW with
$2^{O(n^2 \log n)}$ states.

The proof of Theorem 14 is given in Subsection 6.1 by demonstrating the construction of an automaton
from the family $\mathbf{A}_S$.

To begin with, a strategy to generate more than $2^{O(n^2 \log n)}$ states for the DRW/DPW using
Safra's/Piterman's construction is established. The input NSW for such a strategy has $O(n)$ states and
$h = 2^n$ pairs. One way to generate a sufficiently large number of $(Q,H)$-trees (as used in Schwoon's exposition
of Safra's construction) is to obtain different permutations of the edge labels on a path from a leaf to the
root, and then repeat this for all paths in the tree. We shall follow the construction of Schwoon described in
algorithms SafraNext and SafraNextRecursive (see Subsection (?)) for NSW determinization.




Figure 4: Steps in construction of counter-example

Figure 4 shows three possible $(Q,H)$-trees, $t_x$, $t_y$ and $t_z$, that can be generated using the Safra-Schwoon
construction in algorithm SafraNextRecursive starting from the initial tree $t_0$, where $t_0$ is the CGS tree with
a single (root) node $r_0$, with $\lambda(r_0) = Q_0$, $M(r_0) = 1$, $h(r_0) = [n]$, and for $t_0$ we have $e = f = m + 1$.

The first tree $t_x$ is not hard to generate, since Steps 1 and (?) recursively extend a $(Q,H)$-tree at its
leaves. If the Streett state label of the leaf node in the first tree $t_x$ is $\{q_1, q_2, \ldots, q_n\}$ and
$q_i \in F_h$ for all $i \in [n]$, then in Step 3(b)i a new node is created for each such $q_i \in F_h$, with the edge
from the root node to the newly created node annotated $h$, giving the second tree $t_y$. An application of
Steps 1 and (?) will result in the extension of the second tree $t_y$ at its leaves, giving the third tree $t_z$.
For each Streett state $q_i$, $i \in [n]$, that appears in the label of a leaf node in the third tree $t_z$, the path
from the leaf to the root node is disjoint from every other path in the tree. Each such disjoint path has exactly
the same edge annotations. Note that since the number of leaves in a $(Q,H)$-tree can never be more than the
total number of Streett states, we cannot expect to get more than $n$ disjoint paths from a leaf to the root.
The challenge now is to permute the edge annotations, giving a large number of $(Q,H)$-trees.

Figure 5: Example transformation of (Q, H)-trees

Since the maximum length of a disjoint path in a $(Q,H)$-tree depends on the number of pairs of the
NSW, one would like to start with an NSW with as many pairs as possible. Suppose we start out with
$h = 2^n$ pairs in the NSW. A permutation of $2^n$ edge annotations would give us $(2^n)!$ possible trees with just
one branch and $((2^n)!)^n$ trees with all $n$ disjoint branches. With only $n$ states in the NSW and $2^n$ pairs, it
is clear that one or more Streett states will be replicated across pairs. This replication of Streett states is a
potential problem, as the example in Figure 5 shows.

Figure 5 shows different edge annotations for a path of length $h$ (with no edges) in a $(Q,H)$-tree.
The different edge annotations are obtained as the Streett state label at the leaf changes. We assume that
$h = 2^n$. It is not hard to obtain the edge annotations along $(p_2)$ from the edge annotations along $(p_1)$.
In this transformation only the edge annotation of the first edge in $(p_1)$ changes from $2^n$ to $2^n - 1$. This
is possible if there is a state $q_k$ in the leaf label that also belongs to one component of the pair
$(E_{2^n}, F_{2^n})$. This causes the entire path to be replaced by a pair of nodes: the root node with exactly one
child. The edge between the root and its child node is annotated $2^n - 1$. This path is again extended by
Steps 1 and (?) of algorithm SafraNextRecursive. We see that repeated application of this change allows us to
change the edge annotations of $(p_2)$ to those shown along $(p_3)$, where the edge annotation on the edge from
the root to the first child node is $2^n - k$. Note that this requires that the NSW has a path from $q_k$ back to
itself on some letter or word segment. Once the first edge annotation is fixed, we can apply a similar set of
transformations using some other state $q_l$ to fix the second edge annotation to $2^n - l$. But this immediately
implies that the state $q_l$ cannot be in $E_{2^n-k}$ or $F_{2^n-k}$, since that would either change the annotation of
the first edge to $2^n - k + 1$ or reset the path back to the third path $(p_3)$ shown in the figure. Hence, every
time we fix the edge annotation for an edge, it constrains the possible pairs that a Streett state can belong to.
With only $n$ states and $2^n$ pairs, we are soon forced to repeat Streett states across pairs in our example NSW.
This in turn forces already fixed edge annotations to change, defeating our purpose. Thus, generating arbitrary
permutations of $2^n$ pair indices along paths in a $(Q,H)$-tree is extremely hard with an NSW with just $n$
states.

We then ask if $2^n$ is too many pairs and try to see if $n$ or $n^2$ or some number of pairs polynomial in
$n$ allows us to achieve our objective of obtaining arbitrary permutations of edge annotations. But, with
$n^k$ pairs in the NSW, for some constant $k$, even if obtaining arbitrary permutations of edge annotations is
possible, we can obtain at most $(n^k)!$ permutations along a path and hence $((n^k)!)^n$ $(Q,H)$-trees using all
the paths. But $((n^k)!)^n$ is $2^{O(n^2 \log n)}$, which matches the bound given by our construction and does not
serve our purpose.

We now show a solution to the above dilemma. We start out with $h = 2^n$ pairs in the NSW, but we
partition the $2^n$ pairs into $\lfloor 2^n / n \rfloor$ blocks of $n$ pairs each. Hence

$$B_1 = ((L_{2^n}, U_{2^n}), (L_{2^n-1}, U_{2^n-1}), \ldots, (L_{2^n-(n-1)}, U_{2^n-(n-1)}))$$

is the first block,

$$B_2 = ((L_{2^n-n}, U_{2^n-n}), (L_{2^n-(n+1)}, U_{2^n-(n+1)}), \ldots, (L_{2^n-(2n-1)}, U_{2^n-(2n-1)}))$$

is the second block, and so on. If $\lfloor 2^n / n \rfloor = k$, then the last or $k$-th block is

$$B_k = ((L_{2^n-(k-1)n}, U_{2^n-(k-1)n}), (L_{2^n-((k-1)n+1)}, U_{2^n-((k-1)n+1)}), \ldots, (L_{2^n-(kn-1)}, U_{2^n-(kn-1)})).$$

Instead of trying to generate arbitrary permutations of $2^n$ pair indices, we try to generate permutations of
only $n$ pair indices, but with the following properties for a permutation $(j_1, j_2, \ldots, j_n)$, where
$j_i \in [h]$ for all $i \in \{1, 2, \ldots, n\}$.

• We pick $k = \lfloor 2^n / n \rfloor$ blocks starting with the last block $B_k$ and picking successively lower numbered
blocks $B_{k-1}, B_{k-2}, \ldots$

• From each block we pick exactly one pair index. For example, if we pick the $i$-th pair in block $B_j$, then the
pair is $(L_{2^n-((j-1)n+(i-1))}, U_{2^n-((j-1)n+(i-1))})$. We call this pair index $idx_j^i$.

• If pair index $idx_j^i$ is already picked from block $j$, then we do not pick $idx_l^i$ for $l \neq j$, for every pair of
blocks $B_j$ and $B_l$ that are picked.
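The partition into blocks and the conditions listed above can be sketched as follows. This is one reading of the conditions, chosen to agree with the later count of $n! \times k^n$ block permutations per branch: every chosen index must lie in some block, and no two chosen indices may occupy the same position within their blocks. The function names `blocks` and `is_block_permutation` are hypothetical.

```python
def blocks(n):
    """Partition the pair indices 2^n, 2^n - 1, ... into floor(2^n / n) blocks of n indices each."""
    h = 2 ** n
    k = h // n
    # Block j (1-based) holds the indices 2^n - (j-1)n - i for i = 0, ..., n-1.
    return [[h - (j - 1) * n - i for i in range(n)] for j in range(1, k + 1)]


def is_block_permutation(n, indices):
    """Check the assumed reading: n indices, each inside some block,
    with pairwise distinct positions inside their blocks."""
    h = 2 ** n
    k = h // n
    if len(indices) != n:
        return False
    positions = set()
    for idx in indices:
        offset = h - idx                 # 0 for the pair with index 2^n
        if not 0 <= offset < k * n:      # index falls outside every block
            return False
        positions.add(offset % n)        # position of idx inside its block
    return len(positions) == n


print(blocks(3))
print(is_block_permutation(3, [8, 4, 6]), is_block_permutation(3, [8, 5, 6]))
```

For $n = 3$ there are two blocks, and the indices 8 and 5 both sit in the first position of their blocks, so a sequence containing both is rejected under this reading.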

This system of picking elements of the permutation not only allows us to permute only $n$ elements along
every path from a leaf to the root, but also allows us to choose from $2^n$ Streett pairs and at the same time
have only $O(n)$ states for the example NSW. We shall see later that this method ends up generating more
than $2^{O(n^2 \log n)}$ $(Q,H)$-trees. We shall call a permutation that satisfies the conditions described above a
block permutation of size $n$. An example of an NSW with $O(n)$ states and $2^n$ pairs for which the corresponding
DPW constructed using the Safra/Piterman construction has more than $2^{O(n^2 \log n)}$ states is given below.

6.1. An example showing improved worst case bounds 

Consider the NSW $\mathcal{A}_S = (\Sigma, Q^S, q_0, \delta^S, \mathcal{F})$ defined as follows. The NSW $\mathcal{A}_S$ is an automaton in
the family $\mathbf{A}_S$ described in Theorem 14.

• $Q^S$ is the state set containing $3n + 1$ states $\{q_0\} \cup \{q_{(0,\bot)}, q_{(1,\bot)}, \ldots, q_{(n-1,\bot)}\} \cup
\{q_{(0,s)}, q_{(1,s)}, \ldots, q_{(n-1,s)}\} \cup \{q_{(0,\top)}, q_{(1,\top)}, \ldots, q_{(n-1,\top)}\}$. States of the form
$q_{(i,\bot)}$, $q_{(i,s)}$ and $q_{(i,\top)}$ are called $\bot$-states, $s$-states and $\top$-states
respectively.

• $q_0$ is the initial state.

• E is the alphabet {oq} U {a^^s \ x e {0, 1, 2, . . . , n — 1}} U {ao, . . . , a„_i} U {aj_}- 

• The transitions for the automaton are defined as follows 

1. 5^{qo,ao) = {(7(o,t), g(i,T), • ■ • , g(n-i,T)} 

2. (5^(g(i_T),aj_) = 9(i,T) for aU i S {0, 1, . . . , n - 1} 
3- (5*(g(i,T),ai) = for alH e {0, 1, . . . , n - 1} 

4. '5'^(g(j_^),a(j_^)) = q^^^^) for all i,j e {0, 1, . . . , n - 1}

Figure 6: Example transformation of (Q, H)-trees

• There are $2^n + 1$ Streett pairs $\mathcal{F} = \{(E_{-1}, F_{-1}), (E_1, F_1), (E_2, F_2), \ldots, (E_{2^n}, F_{2^n})\}$, where
$F_i, E_i \subseteq Q^S$ for all $i \in \{-1, 0, 1, \ldots, 2^n\}$, satisfying the following constraints

1- U(o,T),-- • ,9(n-i,T)} ^ -^2" and g(j_T) ^ £'2" for aU i G {0, 1, . . . , n - 1}. 

2. ^(i.i) ^ for all i € {-1, 0, 1, . . . , n - 1} and for aU j e {1, 2, . . . , 2"}. 

3. (7(,.^) ^ Fj for aU i e {0, 1, . . . , n - 1} and for all j e {1, 2, . . . , 2"}. 

4. {9(0,±),---,9(n-l,_L)} ^ F_i 

5- = F2"_™-i for all r e {0, l,...,fc- 1} and ^ F2«_r„_j for aU j e {0, 1, . . . , n - 1} 

and j 7^ i. 

6. {q{i.s)} ^ P(2"-rn-t) for all r G {0, 1, . . . , fc — 1} and for ah t G {0, 1, . . . ,ri — 1}. 

As discussed earlier, our goal is to permute $n$ pair indices chosen carefully from different blocks. For
example, let $B_1 = (2^n - 2n - 2, 2^n - 5n - 1, 2^n, 2^n - 3n - 4, \ldots)$ be a block permutation of size $n$. Our goal
is to start with an arbitrary assignment of edge annotations along a path in a $(Q,H)$-tree and obtain the
permutation $B_1$ along that path. We do not insist that the elements of $B_1$ appear along successive edges
along the path, but we insist that they appear along the path in the same order as they appear in $B_1$.

Figures 6, 7 and 8 demonstrate the main steps in the process of generating the required permutations
of pair indices for the example automaton. In Figure 6, starting from the initial $(Q,H)$-tree
consisting of just the root node, we obtain the tree extended at the root and with Streett state label
$\{q_{(0,\top)}, q_{(1,\top)}, \ldots, q_{(n-1,\top)}\}$ using the transition from $q_0$ on letter $a_0$ and Steps 1 and (?) of the
Safra-Schwoon construction. This single path changes to the branched tree in which the root has $n$ children, with
the edge to each child annotated $2^n$, and the $i$-th child has Streett state label $q_{(i-1,\top)}$. Using a sequence of
transitions on the letters $a_{(0,0)}$ and $a_{(1,s)}$ we obtain the final tree that has $n$ leaves and $n$ disjoint paths,
one from each leaf to the root node.

Figure 7: Example transformation of path $\pi_1$

Note that the letter $a_{(0,0)}$ causes only state $q_{(0,\top)}$ to change to the next state $q_{(0,s)}$, while Streett state
labels for all other leaves remain unchanged. This results in the edge annotation between the root and the
leftmost child changing to $2^n - 1$. On reading the letter $a_{(1,s)}$, state $q_{(0,s)}$ changes to $q_{(1,s)}$, giving us the
tree $t_1$ in the figure. Note that $t_1$ is only an intermediate tree and will evolve through different steps of
the Safra-Schwoon algorithm. We observe that by changing the Streett label of just one path at a time we
can systematically generate permutations of edge annotations one path at a time. This will be our general
strategy henceforth, and we shall see how a path $\pi_1$ in tree $t_1$ evolves with succeeding steps.

The $\top$-states can be thought of as the source states of every path transformation. We change a $\top$-state
to an $s$-state only along the path whose edge annotations we need to modify.

Figure ([7]) shows the transformations of path tti in order to obtain the block permutation Bi in order 
along the edges in tti. It is straightforward to obtain the first element 2" — 2n — 2 along the first edge. All 
it requires is successive applications of letter aji to a(,„_i_<,) follows by a(i,s)- We now try and change the 
other edge annotations keeping the first edge annotation fixed. On reading the letter a(o,s)i we change 
the second edge annotation to 2" — 2. Here, we need to be careful, since an application of a(2,s) at this point 
will change 2" — 2 to 2" — 3 but it will also change 2" — 2ri — 2 to 2" — 2n — 3, because of the way the Streett 
pairs are organised. Hence, we defer the application of 0(2, s) and instead apply letter a(o,s) again, which 
changes 2" to 2" — 1. Now an application of g) will change 2" — 1 to 2" — 3, since 2" — 2 already appears 
on the edge above. Using this general strategy of deferring the application of a letter if it changes an edge 
annotation that is already on an edge above and part of Bi , we can obtain the required block permutation 
Bi along path tti. Note that it is possible that all elements 2" — rn — 1, for all r G {1, 2, fc}, where k is 
the number of blocks may appear between the first element 2" — 2n — 2 and the second element 2" — 5n — 1 
of Bi in order. 

Once all elements of $B_1$ appear along $\pi_1$, we "seal" path $\pi_1$ by applying the letter $a_\bot$, which affects
only $q_{(i,s)}$ at the leaf of $\pi_1$ and does not affect the $\top$-states on the other paths. After this, the state
$q_{(i,\bot)}$, and hence the edge annotations for $\pi_1$, do not ever change. We now apply $a_{(0,1)}$ to change
$q_{(1,\top)}$ to $q_{(0,s)}$ at the leaf of the second path. We then use our usual strategy discussed above to obtain
another block permutation along that path. Continuing this way, we can obtain arbitrary block permutations of
size $n$ along every path in $(Q,H)$-trees.

Figure 8: Example transformation of path $\pi_1$



Although we consider only special types of $(Q,H)$-trees, where the paths of the trees are disjoint from
one another, we shall show that this is sufficient to generate enough trees to go beyond the
$2^{O(n^2 \log n)}$ upper bound given by our construction.

There are $k = \lfloor 2^n / n \rfloor$ blocks of Streett pairs, with $n$ elements in each block. Note that if
$2^n \bmod n \neq 0$, i.e. $n$ is not a power of 2, then some pairs may not appear in any block, but this does not
affect our construction. Also, the pair with index $-1$ is not considered at all and serves only as a placeholder
for the $\bot$-states. Consider a block permutation
$B = (2^n - a_0 n, 2^n - a_1 n - 1, 2^n - a_2 n - 2, \ldots, 2^n - a_{n-1} n - (n-1))$, where
$a_0, \ldots, a_{n-1} \in \{1, \ldots, k\}$. Each element $2^n - a_i n - j$, for all $i, j \in \{0, \ldots, n-1\}$, can be chosen
from one of $k$ blocks. There are $n!$ ways of ordering the blocks themselves. Hence there are $n! \times k^n$ ways
of choosing a block permutation in each branch. Since we consider $(Q,H)$-trees that always have $n$ disjoint
branches/paths, there are $(n! \times k^n)^n$ ways of choosing block permutations in all branches. But
$(n! \times k^n)^n = (n!)^n \times k^{n^2}$. Since $k = 2^n / n$ and Stirling's approximation gives us
$n! = \Omega((n/e)^n)$, this is $\Omega(2^{n^3} / e^{n^2})$, which is $2^{\Omega(n^3)}$. Hence, the Safra-Schwoon construction
generates $2^{\Omega(n^3)}$ $(Q,H)$-trees, which are states of the DRW, while our construction gives a bound of
$2^{O(n^2 \log n)}$ on the number of states of the constructed DPW/DRW. Since the bounds for the Safra-Schwoon
construction are obtained by counting $(Q,H)$-trees without names, the same bounds work when constructing a
DPW from an NSW using compact $(Q,H)$-trees as described by Piterman.
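The count $(n! \times k^n)^n$ can be evaluated numerically to see how it compares with $n^3$ and with $n^2 \log_2 n$ in the exponent. The sketch below is only an illustration under the assumption $k = \lfloor 2^n / n \rfloor$; the function name is hypothetical, and the printed values are not a substitute for the asymptotic argument.

```python
import math


def log2_block_permutation_trees(n):
    """log2 of (n! * k^n)^n, where k = floor(2^n / n) is the number of blocks."""
    k = (2 ** n) // n
    per_branch = math.log2(math.factorial(n)) + n * math.log2(k)  # log2(n! * k^n)
    return n * per_branch                                          # n disjoint branches


for n in (4, 8, 16, 32):
    lower = log2_block_permutation_trees(n)
    # Columns: n, log2 of the tree count, n^3, n^2 log2 n
    print(n, round(lower), n ** 3, round(n * n * math.log2(n)))
```

As $n$ grows, the first column of counts tracks $n^3$ up to a lower-order term of roughly $n^2 \log_2 e$, while $n^2 \log_2 n$ falls far behind.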

Hence, it has been effectively demonstrated that our construction for determinization of $\omega$-automata
using generalized witness sets results in an improved worst case complexity bound for NSW determinization
when the number of pairs of the NSW is $h = 2^n$. Since our construction constructs deterministic parity
automata and complementing parity automata is trivial, the same arguments can be used to show an
improved upper bound for NRW determinization.

In the following we show another interesting consequence of our construction. We show a new lower
bound on the number of states of any $\omega$-automaton accepting a given $\omega$-regular language. Interestingly, this
lower bound on the number of states is a function of the Rabin index of the $\omega$-regular language.

7. A new lower bound for $\omega$-automata

We demonstrate a new lower bound on the number of states of any $\omega$-automaton that uses an acceptance
condition based on infinity sets to accept a given $\omega$-regular language $L$. Interestingly, this lower bound is a
function of the Rabin index of the $\omega$-regular language. The Rabin index of an $\omega$-regular language is defined
as follows.

Definition 15 (Rabin Index). Let $C(k)$ be the set of all $\omega$-regular languages that are accepted by DRW
with $k$ or fewer pairs. For any $\omega$-regular language $L$, the smallest $k$ such that $L \in C(k)$ is called the
Rabin index of $L$.

Wagner [22] and Kaminski [8] showed that the Rabin index is a property of an $\omega$-regular language and not
of the deterministic pairs automaton accepting the given language. They also provided a characterization
of the Rabin index in terms of structural properties of deterministic automata accepting a given $\omega$-regular
language. We provide below a lower bound on the number of states of any $\omega$-automaton that uses an
acceptance condition based on infinity sets and accepts an $\omega$-regular language with a given Rabin index.

Theorem 16. Given an $\omega$-regular language $L$ with Rabin index $k$, any $\omega$-automaton (deterministic or
non-deterministic) that uses an acceptance condition based on infinity sets and accepts $L$ must have at least
$\sqrt{k} - 1$ states.

Proof 17. Let $\mathcal{A}$ be an $\omega$-automaton with $n$ states that uses an acceptance condition based on
infinity sets and accepts $L$. Using the construction of Section 3.2, we can obtain an equivalent DPW with
at most $n^{O(n^2)}$ states and $2 \cdot (n^2 + n + 1)$ parity acceptance sets. This DPW can be interpreted as an
equivalent DRW with the same number of states and at most $n^2 + n + 1$ Rabin acceptance pairs. By definition of
Rabin index we must have $n^2 + n + 1 \ge k$. It follows that $n \ge \sqrt{k} - 1$. □
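The inequality $n^2 + n + 1 \ge k$ used in the proof can be turned into a small calculation. The sketch below finds the smallest $n \ge 1$ satisfying it for a given Rabin index $k$, and that value is never below the $\sqrt{k} - 1$ bound stated in the theorem. The function name `min_states` is hypothetical.

```python
import math


def min_states(k):
    """Smallest n >= 1 with n^2 + n + 1 >= k, for a Rabin index k >= 1."""
    # isqrt(k) - 1 <= sqrt(k) - 1, so the search starts at or below the answer.
    n = max(1, math.isqrt(k) - 1)
    while n * n + n + 1 < k:
        n += 1
    return n


for k in (1, 7, 8, 100, 10_000):
    print(k, min_states(k), round(math.sqrt(k) - 1, 2))
```

Solving the quadratic inequality exactly gives a slightly stronger bound than $\sqrt{k} - 1$, but both grow as $\Theta(\sqrt{k})$.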

8. Conclusion 

In this paper, we presented a new construction for determinization of $\omega$-automata whose acceptance
condition is based on the notion of infinity sets. We extended the Safra/Piterman construction for NSW
determinization using the concept of generalized witness sets to construct an equivalent DPW. We
demonstrated, by way of an example, that there are families of NSW with $O(n)$ states and $2^n$ pairs for which our
construction gives a DPW with better worst case complexity bounds than the Safra/Piterman construction.
Effectively, we have improved the worst case complexity for NSW/NRW determinization. Also, there
is no known direct determinization procedure for NMW; every known procedure uses an indirect method
by first translating the NMW to either an NSW or an NBW and then using determinization on it. Our
method provides a direct determinization construction for NMW. As an easy corollary of our construction,
we demonstrate a new lower bound on the number of states of an $\omega$-automaton accepting a given $\omega$-regular
language, as a function of the Rabin index of the language.